DataLoader

class towhee.data_loader.DataLoader(data_source: Union[Iterable, Callable], parser: Optional[Callable] = None, batch_size: Optional[int] = None)[source]

Bases: object

Parameters:
  • data_source (Union[Iterable, Callable]) – The source to read data from; either an iterable or a callable.

  • parser (Callable, optional) – Function applied to each item read from data_source to produce the input that the pipeline can process.

  • batch_size (int, optional) – If specified, group the read data into batches of size batch_size; otherwise yield items one at a time.
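
The interplay of the three parameters can be illustrated with a small stand-alone sketch. This is not towhee's implementation: iter_loader is a hypothetical helper written only to mirror the behaviour described above. It assumes that a callable data_source is called with no arguments and returns an iterable, that each batch is a plain list, and that a trailing partial batch is still yielded; none of these details are confirmed by the documentation.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Union


def iter_loader(
    data_source: Union[Iterable, Callable[[], Iterable]],
    parser: Optional[Callable[[Any], Any]] = None,
    batch_size: Optional[int] = None,
) -> Iterator[Any]:
    """Hypothetical stand-in for the documented behaviour; not towhee code."""
    # Assumption: a callable source is invoked with no arguments.
    items = data_source() if callable(data_source) else data_source
    # Without a parser, items pass through unchanged.
    parse = parser if parser is not None else (lambda x: x)

    if batch_size is None:
        # No batching: yield parsed items one at a time.
        for item in items:
            yield parse(item)
        return

    batch = []
    for item in items:
        batch.append(parse(item))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Assumption: a final, smaller batch is yielded rather than dropped.
        yield batch


records = [{'num': 1}, {'num': 2}, {'num': 3}]

# Iterable source, no batching.
singles = list(iter_loader(records, parser=lambda x: x['num']))

# Callable source, batches of two.
batches = list(
    iter_loader(lambda: iter(records), parser=lambda x: x['num'], batch_size=2)
)

print(singles)  # [1, 2, 3]
print(batches)  # [[1, 2], [3]]
```

With batching enabled, each yielded value in this sketch is a list of parsed items, so whatever consumes it must accept a batch rather than a single input.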

Examples

>>> from towhee import DataLoader, pipe, ops
>>> p = pipe.input('num').map('num', 'ret', lambda x: x + 1).output('ret')
>>> for data in DataLoader([{'num': 1}, {'num': 2}, {'num': 3}], parser=lambda x: x['num']):
...     print(p(data).to_list(kv_format=True))
[{'ret': 2}]
[{'ret': 3}]
[{'ret': 4}]
__init__(data_source: Union[Iterable, Callable], parser: Optional[Callable] = None, batch_size: Optional[int] = None)[source]
__iter__()[source]