towhee.dag.graph_repr.GraphRepr
- class towhee.dag.graph_repr.GraphRepr(name: str, graph_type: str, op_reprs: Dict[str, OperatorRepr], df_reprs: Dict[str, DataFrameRepr], ir: Optional[str] = None)
Bases: BaseRepr
A GraphRepr represents a complete DAG.
A graph contains individual subcomponents, including Operators, Dataframes, and Variables. Graph representations are used during execution to load functions and pass data to the correct operators.
- Parameters:
name (str) – The representation name.
file_or_url (str) – The file or remote url that stores the information of this representation.
Methods
dfs – Depth-First Search the graph.
from_dict – Generate a GraphRepr from a description dict.
from_yaml – Import a YAML file describing this graph.
get_isolated_df – Get the isolated dataframe(s) in the DAG.
get_isolated_op – Get the isolated operator(s) in the DAG.
get_loop – Get the loop(s) inside the graph.
inject_template
is_valid – Check if the src is a valid YAML file to describe a component in Towhee.
load_file – Load the representation(s) information from a local YAML file.
load_src – Load the information for the representation.
load_str – Load the representation(s) information from a YAML file (pre-loaded as string).
load_url – Load the representation information from a remote YAML file.
render_template
Export a YAML file describing this graph.
Attributes
dataframes
graph_type
ir
name
operators
- __init__(name: str, graph_type: str, op_reprs: Dict[str, OperatorRepr], df_reprs: Dict[str, DataFrameRepr], ir: Optional[str] = None)
- static dfs(cur: str, adj: Dict[str, List[str]], flag: Dict[str, int], cur_list: List[str]) -> Tuple[bool, List[str]]
Depth-First Search the graph.
- Parameters:
cur (str) – The name of current dataframe.
adj (Dict[str, List[str]]) – A dict that stores the adjacent dataframes of each dataframe.
flag (Dict[str, int]) – A dict that stores the status of each dataframe. - 0 means the dataframe has not been visited yet. - 1 means the dataframe has been visited in the current search, so encountering it again means it is part of a loop. - 2 means the dataframe has been searched and confirmed not to be part of a loop.
cur_list (List[str]) – The list of dataframes that have been visited in this search.
- Returns:
- (Tuple[bool, List[str]])
Return False if there is no loop, else True and the loop.
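The three-state search described above can be sketched in plain Python. This is a minimal illustration and not Towhee's implementation: the function name dfs_sketch and the sample graph are hypothetical, and returning an empty list when no loop is found is an assumption, since the reference only specifies the boolean in that case.

```python
from typing import Dict, List, Tuple


def dfs_sketch(cur: str, adj: Dict[str, List[str]], flag: Dict[str, int],
               cur_list: List[str]) -> Tuple[bool, List[str]]:
    """Hypothetical three-state DFS: returns (True, loop) or (False, [])."""
    flag[cur] = 1              # 1: on the current search path
    cur_list.append(cur)
    for nxt in adj.get(cur, []):
        state = flag.get(nxt, 0)
        if state == 1:
            # Reached a dataframe already on the path: the loop is the
            # part of the path that starts at that dataframe.
            return True, cur_list[cur_list.index(nxt):]
        if state == 0:
            found, loop = dfs_sketch(nxt, adj, flag, cur_list)
            if found:
                return True, loop
    flag[cur] = 2              # 2: fully searched, not part of a loop
    cur_list.pop()
    return False, []           # assumption: empty list when no loop exists


# Hypothetical graph: df_a -> df_b -> df_c -> df_a, plus an unconnected df_d.
adj = {"df_a": ["df_b"], "df_b": ["df_c"], "df_c": ["df_a"], "df_d": []}
flag = {name: 0 for name in adj}
has_loop, loop = dfs_sketch("df_a", adj, flag, [])
print(has_loop, loop)
```

Starting the same search from df_d finds no loop, because df_d has no adjacent dataframes.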
- static from_dict(info: Dict[Any, Any]) -> GraphRepr
Generate a GraphRepr from a description dict.
- Parameters:
info (Dict[Any, Any]) – A dict to describe the DAG.
- Returns:
- (towhee.dag.GraphRepr)
The GraphRepr object.
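As an illustration, the example YAML given for from_yaml can be written as a plain Python dict. That from_dict expects exactly this layout is an assumption drawn from that example; the reference does not list the required keys.

```python
from typing import Any, Dict

# Dict form of the example graph description (layout assumed from the
# YAML example documented for from_yaml).
graph_info: Dict[str, Any] = {
    "name": "test_graph",
    "operators": [
        {
            "name": "test_op_1",
            "function": "test_function",
            "inputs": [{"df": "test_df_1", "col": 0}],
            "outputs": [{"df": "test_df_2", "col": 0}],
            "iter_info": {"type": "map"},
        }
    ],
    "dataframes": [
        {"name": "test_df_1", "columns": [{"vtype": "int"}]},
        {"name": "test_df_2", "columns": [{"vtype": "int"}]},
    ],
}

# Names of the dataframes and operators that the description declares.
df_names = [df["name"] for df in graph_info["dataframes"]]
op_names = [op["name"] for op in graph_info["operators"]]
print(df_names, op_names)
```

With Towhee installed, a dict of this shape would presumably be passed as GraphRepr.from_dict(graph_info).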
- static from_yaml(src: str)
Import a YAML file describing this graph. An example YAML file looks like this:

    name: 'test_graph'
    operators:
      - name: 'test_op_1'
        function: 'test_function'
        inputs:
          - df: 'test_df_1'
            col: 0
        outputs:
          - df: 'test_df_2'
            col: 0
        iter_info:
          type: map
    dataframes:
      - name: 'test_df_1'
        columns:
          - vtype: 'int'
      - name: 'test_df_2'
        columns:
          - vtype: 'int'

- Parameters:
src (str) – YAML file (could be pre-loaded as string) to import.
- Returns:
- (towhee.dag.GraphRepr)
The GraphRepr object.
- get_isolated_df() -> Set[str]
Get the isolated dataframe(s) in the DAG.
- Returns:
- (Set[str])
Return the set of isolated dataframes if any exist, else an empty set.
- get_isolated_op() -> Set[str]
Get the isolated operator(s) in the DAG.
- Returns:
- (Set[str])
Return the set of isolated operators if any exist, else an empty set.
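The reference does not define "isolated". The sketch below works under two stated assumptions: an isolated dataframe is one that no operator reads from or writes to, and an isolated operator is one with neither inputs nor outputs. The helper names and the sample data are hypothetical, and the code runs on plain description dicts, not on a GraphRepr.

```python
from typing import Any, Dict, List, Set


def isolated_dataframes(operators: List[Dict[str, Any]],
                        dataframes: List[Dict[str, Any]]) -> Set[str]:
    """Dataframes that no operator references (assumed definition)."""
    referenced: Set[str] = set()
    for op in operators:
        for port in op.get("inputs", []) + op.get("outputs", []):
            referenced.add(port["df"])
    return {df["name"] for df in dataframes} - referenced


def isolated_operators(operators: List[Dict[str, Any]]) -> Set[str]:
    """Operators with no inputs and no outputs (assumed definition)."""
    return {op["name"] for op in operators
            if not op.get("inputs") and not op.get("outputs")}


# Hypothetical description: op_2 is unconnected and df_3 is never used.
operators = [
    {"name": "op_1",
     "inputs": [{"df": "df_1", "col": 0}],
     "outputs": [{"df": "df_2", "col": 0}]},
    {"name": "op_2", "inputs": [], "outputs": []},
]
dataframes = [{"name": "df_1"}, {"name": "df_2"}, {"name": "df_3"}]

iso_dfs = isolated_dataframes(operators, dataframes)
iso_ops = isolated_operators(operators)
print(iso_dfs, iso_ops)
```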
- get_loop() -> List[str]
Get the loop(s) inside the graph.
- Returns:
- (List[str])
Return the loop if one exists, else an empty list.
- static is_valid(info: Dict[str, Any], essentials: Set[str]) -> bool
Check if the src is a valid YAML file to describe a component in Towhee.
- Parameters:
info (Dict[str, Any]) – The dict loaded from the source file.
essentials (Set[str]) – The essential keys that a valid YAML file should contain.
- Returns:
- (bool)
Return True if the src file is a valid YAML file to describe a component in Towhee, else False.
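One plausible reading of this check is that every key in essentials must appear in the loaded dict. The sketch below implements only that reading; the function name is_valid_sketch is hypothetical, and the real method may apply further rules.

```python
from typing import Any, Dict, Set


def is_valid_sketch(info: Dict[str, Any], essentials: Set[str]) -> bool:
    """Return True if info is a dict containing every essential key."""
    return isinstance(info, dict) and essentials.issubset(info.keys())


# Hypothetical description dict and essential keys.
info = {"name": "test_graph", "operators": [], "dataframes": []}
complete = is_valid_sketch(info, {"name", "operators", "dataframes"})
missing = is_valid_sketch({"name": "test_graph"}, {"name", "operators"})
print(complete, missing)
```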
- static load_file(file: str) -> dict
Load the representation(s) information from a local YAML file.
- Parameters:
file (str) – The file path.
- Returns:
- (dict)
The dict loaded from the YAML file that contains the representation information.
- static load_src(file_or_src: str) -> dict
Load the information for the representation. Supported sources are local files, HTTP, and HDFS.
- Parameters:
file_or_src (str) – The path to the source YAML file, a URL pointing to the source file, or a string loaded from the source file.
- Returns:
- (dict)
The YAML file loaded as dict.
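Since load_src accepts a path, a URL, or pre-loaded YAML text, it presumably has to tell these apart before delegating to load_file, load_url, or load_str. How Towhee does this is not documented here. The sketch below shows one possible classification, with a hypothetical classify_source helper and an assumed set of URL schemes; it only classifies the input and does not parse any YAML.

```python
import os
from urllib.parse import urlparse


def classify_source(file_or_src: str) -> str:
    """Guess whether the input is a local file, a URL, or raw YAML text."""
    if os.path.isfile(file_or_src):
        return "file"
    # Assumed scheme list, based on the HTTP and HDFS sources mentioned above.
    if urlparse(file_or_src).scheme in ("http", "https", "hdfs"):
        return "url"
    return "string"


kind_url = classify_source("https://example.com/graph.yaml")
kind_str = classify_source("name: 'test_graph'\noperators: []")
print(kind_url, kind_str)
```

Checking for an existing file first means a local path always wins over the other two interpretations, which is a design choice of this sketch and not a documented behavior.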
- static load_str(string: str) -> dict
Load the representation(s) information from a YAML file (pre-loaded as string).
- Parameters:
string (str) – The string pre-loaded from a YAML.
- Returns:
- (dict)
The dict loaded from the YAML file that contains the representation information.
- static load_url(url: str) -> dict
Load the representation information from a remote YAML file.
- Parameters:
url (str) – The URL pointing to the remote YAML file.
- Returns:
- (dict)
The dict loaded from the YAML file that contains the representation information.