towhee.functional.mixins.dataframe.DataFrameMixin
- class towhee.functional.mixins.dataframe.DataFrameMixin
Bases: object
Mixin to help deal with Entity.
Examples:
Define an operator with the register decorator:
>>> from towhee import register
>>> from towhee import DataFrame
>>> @register
... def add_1(x):
...     return x+1
Apply the operator to a named field of each entity and save the result to another named field:
>>> (
...     DataFrame([dict(a=1, b=2), dict(a=2, b=3)])
...     .as_entity()
...     .add_1['a', 'c']() # <-- use field `a` as input and field `c` as output
...     .as_str()
...     .to_list()
... )
["{'a': 1, 'b': 2, 'c': 2}", "{'a': 2, 'b': 3, 'c': 3}"]
Select the entity on the specified fields.
Examples:
Select the entity on one specified field:
>>> from towhee import Entity
>>> from towhee import DataFrame
>>> df = DataFrame([Entity(a=i, b=i, c=i) for i in range(2)])
>>> df.select['a']().to_list()
[<Entity dict_keys(['a'])>, <Entity dict_keys(['a'])>]
Select multiple fields and unpack the entity:
>>> (
...     DataFrame([Entity(a=i, b=i, c=i) for i in range(5)])
...     .select['a', 'b']()
...     .as_raw()
...     .to_list()
... )
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
Another field selection syntax (not recommended):
>>> (
...     DataFrame([Entity(a=i, b=i, c=i) for i in range(5)])
...     .select('a', 'b')
...     .as_raw()
...     .to_list()
... )
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
Methods
as_entity – Convert elements into Entities.
as_json – Convert entities to JSON strings.
as_raw – Convert entities into raw Python values.
dropna – Drop entities that contain some specific values.
fill_entity – When the DataFrame's iterable consists of Entities and some indexes are missing, fill in default values for those indexes.
parse_json – Parse JSON strings into entities.
rename – Rename columns in the DataFrame.
replace – Replace specific attributes with given values.
Attributes
df
select – Select columns from a DataFrame.
- as_entity(schema: Optional[List[str]] = None)
Convert elements into Entities.
- Parameters:
schema (Optional[List[str]]) – schema contains field names.
Examples:
convert dicts into entities:
>>> from towhee import DataFrame
>>> (
...     DataFrame([dict(a=1, b=2), dict(a=2, b=3)])
...     .as_entity()
...     .as_str()
...     .to_list()
... )
["{'a': 1, 'b': 2}", "{'a': 2, 'b': 3}"]
convert tuples into entities:
>>> from towhee import DataFrame
>>> (
...     DataFrame([(1, 2), (2, 3)])
...     .as_entity(schema=['a', 'b'])
...     .as_str()
...     .to_list()
... )
["{'a': 1, 'b': 2}", "{'a': 2, 'b': 3}"]
convert single value into entities:
>>> from towhee import DataFrame
>>> (
...     DataFrame([1, 2])
...     .as_entity(schema=['a'])
...     .as_str()
...     .to_list()
... )
["{'a': 1}", "{'a': 2}"]
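The three conversions above can be pictured with a small plain-Python sketch. This is only a model of the behaviour the examples show, not towhee's implementation; the helper name `to_entity` is hypothetical and `SimpleNamespace` stands in for `Entity`.

```python
from types import SimpleNamespace

def to_entity(element, schema=None):
    """Hypothetical stand-in for the conversion; not towhee's code."""
    if isinstance(element, dict):
        # dicts already carry their own field names
        return SimpleNamespace(**element)
    if isinstance(element, tuple):
        # tuples are paired positionally with the schema names
        return SimpleNamespace(**dict(zip(schema, element)))
    # a single value is stored under the first schema name
    return SimpleNamespace(**{schema[0]: element})

print([vars(to_entity(e, ['a', 'b'])) for e in [(1, 2), (2, 3)]])
print([vars(to_entity(e, ['a'])) for e in [1, 2]])
```

The sketch assumes the schema is always supplied for tuples and single values, as in the examples above.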
- as_json()
Convert entities to JSON strings.
Examples:
>>> from towhee import DataFrame, Entity
>>> (
...     DataFrame([Entity(x=1)])
...     .as_json()
... )
['{"x": 1}']
- as_raw()
Convert entities into raw Python values.
Examples:
unpack multiple values from entities:
>>> from towhee import DataFrame
>>> (
...     DataFrame([(1, 2), (2, 3)])
...     .as_entity(schema=['a', 'b'])
...     .as_raw()
...     .to_list()
... )
[(1, 2), (2, 3)]
unpack single value from entities:
>>> (
...     DataFrame([1, 2])
...     .as_entity(schema=['a'])
...     .as_raw()
...     .to_list()
... )
[1, 2]
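The difference between the two cases (a tuple for several fields, a bare value for one) can be sketched in plain Python. The helper `to_raw` is hypothetical and uses a dict in place of an Entity; it mirrors the outputs shown above rather than towhee's internals.

```python
def to_raw(fields):
    """Hypothetical model: `fields` is a dict standing in for an Entity."""
    values = tuple(fields.values())
    # one field unpacks to the bare value, several fields to a tuple
    return values[0] if len(values) == 1 else values

print([to_raw(f) for f in [{'a': 1, 'b': 2}, {'a': 2, 'b': 3}]])  # [(1, 2), (2, 3)]
print([to_raw(f) for f in [{'a': 1}, {'a': 2}]])                  # [1, 2]
```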
- dropna(na: Set[str] = {'', None}) → Union[bool, DataFrame]
Drop entities that contain some specific values.
- Parameters:
na (Set[str]) – Entities that contain any value in na are dropped.
Examples:
>>> from towhee import Entity, DataFrame
>>> entities = [Entity(a=i, b=i + 1) for i in range(3)]
>>> entities.append(Entity(a=3, b=''))
>>> df = DataFrame(entities)
>>> df
[<Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>]
>>> df.dropna()
[<Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>]
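Because the Entity repr hides field values, it is not obvious from the output which entity was dropped. A plain-Python sketch of the filtering rule, with dicts standing in for entities and a hypothetical helper `drop_na`, makes that visible. It assumes field values are hashable and is not towhee's implementation.

```python
def drop_na(rows, na=frozenset({'', None})):
    """Hypothetical model: keep only rows with no value in `na`."""
    return [row for row in rows if not any(v in na for v in row.values())]

rows = [{'a': i, 'b': i + 1} for i in range(3)]
rows.append({'a': 3, 'b': ''})  # this row holds an empty string

kept = drop_na(rows)
print(kept)  # the row with b='' is gone
```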
- fill_entity(_DefaultKVs: Optional[Dict[str, Any]] = None, _ReplaceNoneValue: bool = False, **kws)
When the DataFrame's iterable consists of Entities and some indexes are missing, fill in default values for those indexes.
- Parameters:
_ReplaceNoneValue (bool) – Whether to replace None in Entity’s value.
_DefaultKVs (Dict[str, Any]) – The key-value pairs stored in a dict.
Examples:
>>> from towhee import Entity, DataFrame
>>> entities = [Entity(num=i) for i in range(3)]
>>> df = DataFrame(entities)
>>> df
[<Entity dict_keys(['num'])>, <Entity dict_keys(['num'])>, <Entity dict_keys(['num'])>]
>>> kvs = {'foo': 'bar'}
>>> df.fill_entity(kvs).fill_entity(usage='test').to_list()
[<Entity dict_keys(['num', 'foo', 'usage'])>, <Entity dict_keys(['num', 'foo', 'usage'])>, <Entity dict_keys(['num', 'foo', 'usage'])>]
>>> kvs = {'FOO': None}
>>> df.fill_entity(_ReplaceNoneValue=True, _DefaultKVs=kvs).to_list()[0].FOO
0
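The default-filling step in the first two calls can be modelled in plain Python as below. The helper `fill_defaults` is hypothetical and works on dicts instead of entities; it deliberately leaves out the `_ReplaceNoneValue` behaviour, whose exact rules are not spelled out here.

```python
def fill_defaults(row, defaults=None, **kws):
    """Hypothetical model of default filling; `row` is a dict standing in for an Entity."""
    merged = dict(defaults or {})
    merged.update(kws)
    filled = dict(row)
    for key, value in merged.items():
        # only keys the row lacks receive the default
        filled.setdefault(key, value)
    return filled

rows = [{'num': i} for i in range(3)]
rows = [fill_defaults(r, {'foo': 'bar'}) for r in rows]
rows = [fill_defaults(r, usage='test') for r in rows]
print(list(rows[0]))  # ['num', 'foo', 'usage']
```

Existing keys are left untouched in this sketch, so filling never overwrites a value that is already present.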
- parse_json()
Parse JSON strings into entities.
Examples:
>>> from towhee import DataFrame
>>> df = (
...     DataFrame(['{"x": 1}'])
...     .parse_json()
... )
>>> df[0].x
1
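`parse_json` and `as_json` look like inverses of each other. A rough standard-library analogue of that round trip is sketched below; `parse` and `dump` are hypothetical helpers and `SimpleNamespace` stands in for `Entity`, so this says nothing about how towhee implements either method.

```python
import json
from types import SimpleNamespace

def parse(text):
    """Hypothetical analogue of parse_json for one string."""
    return SimpleNamespace(**json.loads(text))

def dump(entity):
    """Hypothetical analogue of as_json for one entity."""
    return json.dumps(vars(entity))

entity = parse('{"x": 1}')
print(entity.x)      # 1
print(dump(entity))  # {"x": 1}
```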
- rename(column: Dict[str, str])
Rename columns in the DataFrame.
- Parameters:
column (Dict[str, str]) – The columns to rename and their corresponding new names.
Examples:
>>> from towhee import Entity, DataFrame
>>> entities = [Entity(a=i, b=i + 1) for i in range(3)]
>>> df = DataFrame(entities)
>>> df
[<Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>, <Entity dict_keys(['a', 'b'])>]
>>> df.rename(column={'a': 'A', 'b': 'B'})
[<Entity dict_keys(['A', 'B'])>, <Entity dict_keys(['A', 'B'])>, <Entity dict_keys(['A', 'B'])>]
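The renaming in the example amounts to a key mapping applied to every entity. A minimal dict-based sketch follows; `rename_fields` is a hypothetical helper, and leaving unlisted keys unchanged is an assumption of this sketch rather than documented behaviour.

```python
def rename_fields(row, column):
    """Hypothetical model: map old field names to new ones, keep values."""
    # keys missing from `column` keep their original name (an assumption)
    return {column.get(key, key): value for key, value in row.items()}

rows = [{'a': i, 'b': i + 1} for i in range(3)]
renamed = [rename_fields(r, {'a': 'A', 'b': 'B'}) for r in rows]
print([list(r) for r in renamed])  # [['A', 'B'], ['A', 'B'], ['A', 'B']]
```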
- replace(**kws)
Replace specific attributes with given values.
Examples:
>>> from towhee import Entity, DataFrame
>>> entities = [Entity(num=i) for i in range(5)]
>>> df = DataFrame(entities)
>>> [i.num for i in df]
[0, 1, 2, 3, 4]
>>> df = df.replace(num={0: 1, 1: 2, 2: 3, 3: 4, 4: 5})
>>> [i.num for i in df]
[1, 2, 3, 4, 5]
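In the example, each keyword names a field and its dict maps old values to new ones. A plain-Python sketch of that lookup is given below, with the hypothetical helper `replace_values` and dicts in place of entities. Each value is looked up once, so `0 -> 1` does not cascade into `1 -> 2`.

```python
def replace_values(row, **kws):
    """Hypothetical model: per-field old-value -> new-value lookup."""
    out = dict(row)
    for field, mapping in kws.items():
        # values without an entry in the mapping are left as they are
        if field in out and out[field] in mapping:
            out[field] = mapping[out[field]]
    return out

rows = [{'num': i} for i in range(5)]
rows = [replace_values(r, num={0: 1, 1: 2, 2: 3, 3: 4, 4: 5}) for r in rows]
print([r['num'] for r in rows])  # [1, 2, 3, 4, 5]
```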
- property select
Select columns from a DataFrame.
Examples:
>>> from towhee import Entity, DataFrame
>>> entities = [Entity(a=i, b=i, c=i) for i in range(3)]
>>> dc = DataFrame(entities)
>>> dc.select('a')
[<Entity dict_keys(['a'])>, <Entity dict_keys(['a'])>, <Entity dict_keys(['a'])>]