DataSource
- class towhee.data_loader.DataLoader(data_source: Iterable | Callable, parser: Callable | None = None, batch_size: int | None = None)
Bases: object
- Args:
- data_source (Union[Iterable, Callable]):
The source to read data from; it can be an iterable or a callable.
- parser (Callable, optional):
Function applied to each item that is read, converting it into input that the pipeline can process.
- batch_size (int, optional):
If batch_size is specified, the read data is grouped into batches of size batch_size; otherwise each item is yielded individually.
- Examples:
>>> from towhee import DataLoader, pipe, ops
>>> p = pipe.input('num').map('num', 'ret', lambda x: x + 1).output('ret')
>>> for data in DataLoader([{'num': 1}, {'num': 2}, {'num': 3}], parser=lambda x: x['num']):
...     print(p(data).to_list(kv_format=True))
[{'ret': 2}]
[{'ret': 3}]
[{'ret': 4}]
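The batching behaviour described under Args can be illustrated without towhee installed. The sketch below is not the library's implementation: `SketchLoader` and `read_numbers` are hypothetical names, and two details are assumptions rather than documented facts, namely that a callable data_source is invoked with no arguments to obtain an iterable, and that a trailing partial batch is still yielded.

```python
class SketchLoader:
    """Hypothetical stand-in that mimics the documented DataLoader arguments."""

    def __init__(self, data_source, parser=None, batch_size=None):
        self._source = data_source
        # With no parser, items pass through unchanged (assumed default).
        self._parser = parser if parser is not None else (lambda item: item)
        self._batch_size = batch_size

    def __iter__(self):
        # Assumption: a callable source is called with no arguments
        # and returns an iterable.
        items = self._source() if callable(self._source) else self._source

        if self._batch_size is None:
            # No batch_size: yield each parsed item on its own.
            for item in items:
                yield self._parser(item)
            return

        batch = []
        for item in items:
            batch.append(self._parser(item))
            if len(batch) == self._batch_size:
                yield batch
                batch = []
        if batch:
            # Assumption: the final, shorter batch is not dropped.
            yield batch


def read_numbers():
    """Hypothetical callable data source producing five records."""
    for n in range(1, 6):
        yield {'num': n}


singles = list(SketchLoader(read_numbers, parser=lambda x: x['num']))
batches = list(SketchLoader(read_numbers, parser=lambda x: x['num'], batch_size=2))
print(singles)
print(batches)
```

With the real class, the same arguments would be passed to `DataLoader(...)`; whether it handles the last partial batch and callable sources exactly this way should be checked against the towhee source.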
- __init__(data_source: Iterable | Callable, parser: Callable | None = None, batch_size: int | None = None)