Feature Engineer

class towhee.hub.builtin.operators.feature_engineer.standard_scaler(name: Optional[str] = None)[source]

Standardize numerical features by removing the mean and scaling to unit variance.

Examples:

>>> from towhee import DataCollection, Entity
>>> dc = (
...     DataCollection.range(10).map(lambda x: Entity(a=x))
...         .set_training()
...         .standard_scaler['a', 'b'](name='standard_scaler')
... )
>>> [int(x.b*10) for x in dc.to_list()]
[-15, -12, -8, -5, -1, 1, 5, 8, 12, 15]
class towhee.hub.builtin.operators.feature_engineer.num_discretizer(name: Optional[str] = None, n_bins=10, encode='onehot', strategy='quantile')[source]

Bin numerical features into intervals.

Examples:

>>> from towhee import DataCollection, Entity
>>> dc = (
...     DataCollection.range(10).map(lambda x: Entity(a=x))
...         .set_training()
...         .num_discretizer['a', 'b'](name='discretizer', n_bins=3)
... )
class towhee.hub.builtin.operators.feature_engineer.cate_one_hot_encoder(name: Optional[str] = None)[source]

Standardize numerical features by removing the mean and scaling to unit variance.

Examples:

>>> from towhee import DataCollection, Entity
>>> dc = (
...     DataCollection(['a','b','c','a','b']).map(lambda x: Entity(a=x))
...         .set_training()
...         .cate_one_hot_encoder['a', 'b'](name='one_hot_encoder')
... )
>>> [x.b.nonzero()[1][0] for x in dc.to_list()]
[0, 1, 2, 0, 1]