DataSource
- class towhee.data_loader.DataLoader(data_source: Union[Iterable, Callable], parser: Optional[Callable] = None, batch_size: Optional[int] = None)
Bases: object
- Parameters:
data_source (Union[Iterable, Callable]) – The source to read data from; either an iterable or a callable.
parser (Callable) – Function applied to each item read from data_source, producing input that the pipeline can process.
batch_size (int) – If specified, group the read data into batches of size batch_size; otherwise, yield items one at a time.
Examples
>>> from towhee import DataLoader, pipe, ops
>>> p = pipe.input('num').map('num', 'ret', lambda x: x + 1).output('ret')
>>> for data in DataLoader([{'num': 1}, {'num': 2}, {'num': 3}], parser=lambda x: x['num']):
...     print(p(data).to_list(kv_format=True))
[{'ret': 2}]
[{'ret': 3}]
[{'ret': 4}]
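The example above does not exercise batch_size or a callable data_source. The following is a minimal pure-Python sketch of how the three documented parameters might interact. It does not use towhee and is not the library's implementation: SketchLoader is a hypothetical stand-in, and it assumes that a callable source is invoked with no arguments, that items pass through unchanged when no parser is given, and that a trailing partial batch is still yielded.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Union


class SketchLoader:
    """Hypothetical stand-in that mimics the documented DataLoader parameters."""

    def __init__(
        self,
        data_source: Union[Iterable, Callable],
        parser: Optional[Callable] = None,
        batch_size: Optional[int] = None,
    ) -> None:
        self._source = data_source
        # Assumption: with no parser, items pass through unchanged.
        self._parser = parser if parser is not None else (lambda x: x)
        self._batch_size = batch_size

    def __iter__(self) -> Iterator[Any]:
        # Assumption: a callable source is called with no arguments
        # and returns an iterable.
        items = self._source() if callable(self._source) else self._source
        parsed = (self._parser(item) for item in items)

        if self._batch_size is None:
            # No batch size: yield parsed items one at a time.
            yield from parsed
            return

        batch = []
        for value in parsed:
            batch.append(value)
            if len(batch) == self._batch_size:
                yield batch
                batch = []
        if batch:
            # Assumption: a trailing partial batch is still yielded.
            yield batch


records = [{'num': 1}, {'num': 2}, {'num': 3}]

# Iterable source, no batching.
singles = list(SketchLoader(records, parser=lambda x: x['num']))

# Callable source, batches of two.
batches = list(
    SketchLoader(lambda: iter(records), parser=lambda x: x['num'], batch_size=2)
)

print(singles)  # [1, 2, 3]
print(batches)  # [[1, 2], [3]]
```

Whether the real DataLoader keeps or drops a final partial batch should be checked against the towhee source before relying on it.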
- __init__(data_source: Union[Iterable, Callable], parser: Optional[Callable] = None, batch_size: Optional[int] = None)