cluster.preprocess package

Submodules

cluster.preprocess.pre_node module

class cluster.preprocess.pre_node.PreProcessNode[source]

Bases: cluster.common.common_node.WorkFlowCommonNode

load_data(node_id, parm='all')[source]
run(conf_data)[source]

cluster.preprocess.pre_node_feed module

class cluster.preprocess.pre_node_feed.PreNodeFeed[source]

Bases: cluster.preprocess.pre_node.PreProcessNode

Error-check rule to add: Dataconf check.

data_size()[source]
file_size()[source]
Returns:
has_next()[source]

check if hdf5 file pointer has next :return:

next()[source]

move pointer +1 :return:

reset_pointer()[source]

reset the hdf5 file pointer to the start :return:

run(conf_data)[source]
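The pointer-based interface above (has_next / next / reset_pointer) is a simple cursor over an HDF5 file. A minimal sketch of the pattern, with a plain list standing in for the h5py dataset that PreNodeFeed actually wraps (all names here are illustrative, not the module's real internals):

```python
class CursorFeed:
    """Sketch of the has_next/next/reset_pointer cursor pattern.

    A plain list stands in for the hdf5 dataset; names are illustrative.
    """

    def __init__(self, rows):
        self.rows = rows      # would be an h5py dataset in PreNodeFeed
        self.pointer = 0      # current read position

    def data_size(self):
        return len(self.rows)

    def has_next(self):
        # check if the pointer can still advance
        return self.pointer < len(self.rows)

    def next(self):
        # move pointer +1 and return the row it passed over
        row = self.rows[self.pointer]
        self.pointer += 1
        return row

    def reset_pointer(self):
        # rewind to the start so the feed can be iterated again
        self.pointer = 0


feed = CursorFeed(["a", "b", "c"])
seen = []
while feed.has_next():
    seen.append(feed.next())
feed.reset_pointer()
```

After the loop drains the feed, reset_pointer makes it iterable again, which is why the train loop can make several passes over one file.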

cluster.preprocess.pre_node_feed_fr2attn module

class cluster.preprocess.pre_node_feed_fr2attn.PreNodeFeedFr2Attn[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]

get the data array size of this class :return:

has_next()[source]

check if hdf5 file pointer has next :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_fr2auto module

class cluster.preprocess.pre_node_feed_fr2auto.PreNodeFeedFr2Auto[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]

get the data array size of this class :return:

has_next()[source]

check if hdf5 file pointer has next :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_fr2cnn module

class cluster.preprocess.pre_node_feed_fr2cnn.PreNodeFeedFr2Cnn[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_fr2seq module

class cluster.preprocess.pre_node_feed_fr2seq.PreNodeFeedFr2Seq[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]

get the data array size of this class :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_fr2wcnn module

class cluster.preprocess.pre_node_feed_fr2wcnn.PreNodeFeedFr2Wcnn[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]

get the data array size of this class :return:

has_next()[source]

check if hdf5 file pointer has next :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_fr2wdnn module

class cluster.preprocess.pre_node_feed_fr2wdnn.PreNodeFeedFr2Wdnn[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

add_none_keys_cate_conti_list(conti_list, cate_list)[source]

List that separates continuous and categorical features for the Example

create_feature_columns(dataconf=None)[source]

Get feature columns for the TFRecord reader (extracts features from the TFRecord)

data_size()[source]
input_fn(mode, data_file, batch_size, dataconf=None)[source]
input_fn2(mode, data_file, df, dataconf)[source]

Wide & Deep Network input tensor maker V1.0 16.11.04 Initial

:param df: dataframe from HBase
:param nnid:
:return: sparse tensor, constraint
make_continuous_category_list(cell_feature)[source]

List that separates continuous and categorical features for the Example
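make_continuous_category_list and add_none_keys_cate_conti_list both revolve around splitting a feature row into continuous and categorical column lists. A hedged sketch of that split; the function body and the numeric test are assumptions for illustration, not the module's actual logic:

```python
def make_continuous_category_list(cell_feature):
    """Split feature cells into continuous and categorical name lists.

    Illustrative only: a value is treated as continuous when it is
    numeric, categorical otherwise.
    """
    conti_list, cate_list = [], []
    for name, value in cell_feature.items():
        if isinstance(value, (int, float)):
            conti_list.append(name)
        else:
            cate_list.append(name)
    return conti_list, cate_list


conti, cate = make_continuous_category_list(
    {"age": 31, "income": 5200.0, "city": "seoul", "job": "dev"}
)
```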

multi_queue_and_h5_print(file_name)[source]
run(conf_data)[source]

overrides run to initialize the class

set_for_predict(nnid=None)[source]

cluster.preprocess.pre_node_feed_fr2wv module

class cluster.preprocess.pre_node_feed_fr2wv.PreNodeFeedFr2Wv[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]

get the data array size of this class :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_img2auto module

class cluster.preprocess.pre_node_feed_img2auto.PreNodeFeedImg2Auto[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_img2cnn module

class cluster.preprocess.pre_node_feed_img2cnn.PreNodeFeedImg2Cnn[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

run(conf_data)[source]

overrides run to initialize the class

size()[source]

cluster.preprocess.pre_node_feed_img2renet module

class cluster.preprocess.pre_node_feed_img2renet.PreNodeFeedImg2Renet[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_iob2bilstmcrf module

class cluster.preprocess.pre_node_feed_iob2bilstmcrf.PreNodeFeedIob2BiLstmCrf[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

get_file_name()[source]

get file name of current file pointer :return:

run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_keras2frame module

class cluster.preprocess.pre_node_feed_keras2frame.PreNodeFeedKerasFrame[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

pre_feed_keras2frame

class LabelEncoder

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Encode labels with value between 0 and n_classes-1.

Read more in the User Guide.

classes_
: array of shape (n_class,)
Holds the label for each class.

LabelEncoder can be used to normalize labels.

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2])
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])

It can also be used to transform non-numerical labels (as long as they are hashable and comparable) to numerical labels.

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"])
array([2, 2, 1])
>>> list(le.inverse_transform([2, 2, 1]))
['tokyo', 'tokyo', 'paris']
See also sklearn.preprocessing.OneHotEncoder
: encodes categorical integer features
using a one-hot (one-of-K) scheme.
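The one-of-K scheme mentioned above maps each of K distinct category values to a K-length indicator vector with a single 1. A minimal dependency-free sketch of the idea (not sklearn's implementation):

```python
def one_hot(labels):
    """Encode hashable labels as one-of-K indicator vectors.

    Classes are the sorted unique labels; each output row has a
    single 1 at its label's class index.
    """
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [
        [1 if index[lab] == i else 0 for i in range(len(classes))]
        for lab in labels
    ]


classes, encoded = one_hot(["paris", "tokyo", "paris"])
```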
fit(y)

Fit label encoder

y
: array-like of shape (n_samples,)
Target values.

self : returns an instance of self.

fit_transform(y)

Fit label encoder and return encoded labels

y
: array-like of shape [n_samples]
Target values.

y : array-like of shape [n_samples]

inverse_transform(y)

Transform labels back to original encoding.

y
: numpy array of shape [n_samples]
Target values.

y : numpy array of shape [n_samples]

transform(y)

Transform labels to normalized encoding.

y
: array-like of shape [n_samples]
Target values.

y : array-like of shape [n_samples]

PreNodeFeedKerasFrame.add_none_keys_cate_conti_list(conti_list, cate_list)[source]

List that separates continuous and categorical features for the Example

PreNodeFeedKerasFrame.create_feature_columns(dataconf=None)[source]

Get feature columns for the TFRecord reader (extracts features from the TFRecord)

PreNodeFeedKerasFrame.data_size()[source]
PreNodeFeedKerasFrame.dummyEncode(CATEGORICAL_COLUMNS, df)[source]
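dummyEncode's signature suggests it integer-codes the columns named in CATEGORICAL_COLUMNS before feeding the frame to the network. The module's actual logic is not shown here, but the idea can be sketched over a dict-of-lists frame standing in for the pandas DataFrame (all names hypothetical):

```python
def dummy_encode(categorical_columns, frame):
    """Replace each categorical column's values with integer codes.

    `frame` is a dict of column name -> list of values, a stand-in
    for the DataFrame the real method receives.
    """
    for col in categorical_columns:
        # stable codes: sort the distinct values, then index them
        codes = {v: i for i, v in enumerate(sorted(set(frame[col])))}
        frame[col] = [codes[v] for v in frame[col]]
    return frame


df = {"city": ["seoul", "busan", "seoul"], "age": [31, 25, 40]}
encoded = dummy_encode(["city"], df)
```

Non-categorical columns pass through untouched, which is why the method takes the column list explicitly rather than inferring it.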
PreNodeFeedKerasFrame.input_fn(mode, data_file, batch_size, dataconf=None)[source]
PreNodeFeedKerasFrame.input_fn2(mode, data_file, df, dataconf)[source]

Wide & Deep Network input tensor maker V1.0 16.11.04 Initial

:param df: dataframe from HBase
:param nnid:
:return: sparse tensor, constraint
PreNodeFeedKerasFrame.input_fn3(data_file, df, dataconf)[source]

Wide & Deep Network input tensor maker V1.0 16.11.04 Initial

:param df: dataframe from HBase
:param nnid:
:return: sparse tensor, constraint
PreNodeFeedKerasFrame.make_continuous_category_list(cell_feature)[source]

List that separates continuous and categorical features for the Example

PreNodeFeedKerasFrame.multi_queue_and_h5_print(file_name)[source]
PreNodeFeedKerasFrame.run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_text2dv module

class cluster.preprocess.pre_node_feed_text2dv.PreNodeFeedText2Dv[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
run(conf_data)[source]

overrides run to initialize the class

train_file_path()[source]

cluster.preprocess.pre_node_feed_text2fasttext module

class cluster.preprocess.pre_node_feed_text2fasttext.PreNodeFeedText2FastText[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_feed_text2seq module

class cluster.preprocess.pre_node_feed_text2seq.PreNodeFeedText2Seq[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
has_next()[source]

check if hdf5 file pointer has next :return:

len()[source]
Returns:
next()[source]

move pointer +1 :return:

run(conf_data)[source]
Parameters: conf_data
Returns:

cluster.preprocess.pre_node_feed_text2wv module

class cluster.preprocess.pre_node_feed_text2wv.PreNodeFeedText2Wv[source]

Bases: cluster.preprocess.pre_node_feed.PreNodeFeed

data_size()[source]
run(conf_data)[source]

overrides run to initialize the class

cluster.preprocess.pre_node_merge_text2seq module

class cluster.preprocess.pre_node_merge_text2seq.PreNodeMergeText2Seq[source]

Bases: cluster.preprocess.pre_node.PreProcessNode

load_data(node_id, parm='all')[source]

load train data

:param node_id:
:param parm:
:return:

run(conf_data)[source]

cluster.preprocess.pre_node_prenet module

class cluster.preprocess.pre_node_prenet.PreProcessNodePreNet[source]

Bases: cluster.preprocess.pre_node.PreProcessNode

run(conf_data)[source]

Module contents