cluster.common package¶
Submodules¶
cluster.common.common_node module¶
class cluster.common.common_node.WorkFlowCommonNode[source]¶
Bases: object

Creates empty stub methods for loading data for WDNN.
decode_pad(input_list, max_len=0, pad_char='#', start_char='@')[source]¶
Left-pads the input with pad_char: [pad_char] * pad_len + input.
:param input_list:
:param max_len:
:param pad_char:
:param start_char:
:return:
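The padding rule above can be sketched as a standalone function (a hypothetical re-implementation, not the actual method on WorkFlowCommonNode; the role of start_char is undocumented, so it is left unused here):

```python
def decode_pad(input_list, max_len=0, pad_char='#', start_char='@'):
    """Left-pad input_list with pad_char up to max_len: [pad_char] * pad_len + input."""
    # start_char is accepted for signature compatibility; its use is not documented
    pad_len = max(max_len - len(input_list), 0)
    return [pad_char] * pad_len + list(input_list)
```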
find_next_node(node_name, node_list)[source]¶
Finds the next node and returns its name.
:param node_name:
:param node_list:
:return:
find_prev_node(node_name, node_list)[source]¶
Finds the previous node and returns its name.
:param node_name:
:param node_list:
:return:
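Assuming node_list is an ordered list of node names (the actual structure is not documented), the two lookups above might look like this sketch:

```python
def find_next_node(node_name, node_list):
    """Return the name of the node after node_name, or None if it is the last."""
    idx = node_list.index(node_name)
    return node_list[idx + 1] if idx + 1 < len(node_list) else None

def find_prev_node(node_name, node_list):
    """Return the name of the node before node_name, or None if it is the first."""
    idx = node_list.index(node_name)
    return node_list[idx - 1] if idx > 0 else None
```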
get_linked_next_node_with_type(type)[source]¶
Gets the nodes linked forward that match the given type.
:param type:
:return:
get_linked_prev_node_with_cond(val, cond='has_value')[source]¶
Walks linked nodes backward until it finds a node with the given parameter value.
:param val:
:param cond:
:return:
get_linked_prev_node_with_type(type)[source]¶
Gets the nodes linked backward that match the given type.
:param type:
:return:
-
cluster.common.neural_common_bilismcrf module¶
class cluster.common.neural_common_bilismcrf.BiLstmCommon[source]¶
Bases: object

Common functions for BiLSTM-CRF.
class CoNLLDataset(filename, processing_word=None, processing_tag=None, max_iter=None, all_line=True)[source]¶
Bases: object

Class that iterates over a CoNLL dataset.
BiLstmCommon.NONE = 'O'¶

BiLstmCommon.NUM = '$NUM$'¶

BiLstmCommon.UNK = '$UNK$'¶
BiLstmCommon.export_trimmed_glove_vectors(vocab, model, trimmed_filename)[source]¶
Saves GloVe vectors in a numpy array.
Args:
- vocab: dictionary vocab[word] = index
- glove_filename: path to a GloVe file
- trimmed_filename: path where the matrix is stored in npy format
- dim: (int) dimension of the embeddings
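A sketch of the documented behavior, following the Args list rather than the signature (glove_filename and dim come from the docstring; the model parameter shown in the signature is not described, so it is omitted here):

```python
import numpy as np

def export_trimmed_glove_vectors(vocab, glove_filename, trimmed_filename, dim):
    """Save the GloVe vectors of the words in vocab as a compressed .npz matrix."""
    embeddings = np.zeros([len(vocab), dim])
    with open(glove_filename) as f:
        for line in f:
            parts = line.strip().split(' ')
            word = parts[0]
            if word in vocab:
                # row index of the word is its id in vocab
                embeddings[vocab[word]] = np.asarray(parts[1:], dtype=float)
    np.savez_compressed(trimmed_filename, embeddings=embeddings)
```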
BiLstmCommon.get_char_vocab(dataset, chars=None)[source]¶
Args:
- dataset: an iterator yielding tuples (sentence, tags)
Returns:
- a set of all the characters in the dataset
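A minimal sketch of this character-set collection, assuming the dataset yields (words, tags) pairs as documented:

```python
def get_char_vocab(dataset):
    """Collect the set of all characters appearing in the words of a dataset."""
    vocab_char = set()
    for words, _tags in dataset:
        for word in words:
            vocab_char.update(word)
    return vocab_char
```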
BiLstmCommon.get_chunks(seq, tags)[source]¶
Args:
- seq: [4, 4, 0, 0, ...] sequence of labels
- tags: dict["O"] = 4
Returns:
- list of (chunk_type, chunk_start, chunk_end)
Example:
- seq = [4, 5, 0, 3]
- tags = {"B-PER": 4, "I-PER": 5, "B-LOC": 3}
- result = [("PER", 0, 2), ("LOC", 3, 4)]
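The example above can be reproduced with the following sketch (it assumes IOB-style tag names like "B-PER"/"I-PER" and that tags also maps the NONE label "O"):

```python
def get_chunks(seq, tags):
    """Turn a label-id sequence into (chunk_type, chunk_start, chunk_end) triples."""
    default = tags['O']  # id of the NONE label
    idx_to_tag = {idx: tag for tag, idx in tags.items()}
    chunks = []
    chunk_type, chunk_start = None, None
    for i, tok in enumerate(seq):
        if tok == default:
            # a NONE label closes any open chunk
            if chunk_type is not None:
                chunks.append((chunk_type, chunk_start, i))
                chunk_type, chunk_start = None, None
        else:
            tok_class, tok_type = idx_to_tag[tok].split('-')
            if chunk_type is None:
                chunk_type, chunk_start = tok_type, i
            elif tok_type != chunk_type or tok_class == 'B':
                # a type change or an explicit B- tag starts a new chunk
                chunks.append((chunk_type, chunk_start, i))
                chunk_type, chunk_start = tok_type, i
    if chunk_type is not None:
        chunks.append((chunk_type, chunk_start, len(seq)))
    return chunks
```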
BiLstmCommon.get_processing_word(vocab_words=None, vocab_chars=None, lowercase=False, chars=False)[source]¶
Args:
- vocab: dict[word] = idx
Returns:
- f("cat") = ([12, 4, 32], 12345) = (list of char ids, word id)
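A sketch of the closure described above; the $UNK$ fallback mirrors the UNK constant, but the library's exact handling of unknown words and characters may differ:

```python
def get_processing_word(vocab_words=None, vocab_chars=None, lowercase=False, chars=False):
    """Build f(word) -> (list of char ids, word id), or just the word id."""
    def f(word):
        char_ids = []
        if vocab_chars is not None and chars:
            # char ids come from the original, non-lowercased word
            char_ids = [vocab_chars[c] for c in word if c in vocab_chars]
        if lowercase:
            word = word.lower()
        if vocab_words is not None:
            word = vocab_words.get(word, vocab_words.get('$UNK$'))
        if vocab_chars is not None and chars:
            return char_ids, word
        return word
    return f
```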
BiLstmCommon.get_trimmed_glove_vectors(filename)[source]¶
Args:
- filename: path to the npz file
Returns:
- matrix of embeddings (np array)
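Loading the trimmed matrix back is a one-liner, assuming the array was stored under the key 'embeddings' as in the export sketch:

```python
import numpy as np

def get_trimmed_glove_vectors(filename):
    """Load the embedding matrix from a .npz file."""
    with np.load(filename) as data:
        return data['embeddings']
```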
BiLstmCommon.get_vocabs(datasets, vocab=None, tags=None)[source]¶
Args:
- datasets: a list of dataset objects
Returns:
- a set of all the words in the datasets
BiLstmCommon.load_vocab(filename)[source]¶
Args:
- filename: file with one word per line
Returns:
- d: dict[word] = index
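The word-per-line format maps naturally to line indices; a minimal sketch:

```python
def load_vocab(filename):
    """Map each word (one per line) to its zero-based line index."""
    d = {}
    with open(filename) as f:
        for idx, word in enumerate(f):
            d[word.strip()] = idx
    return d
```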
BiLstmCommon.minibatches(data, minibatch_size)[source]¶
Args:
- data: generator of (sentence, tags) tuples
- minibatch_size: (int)
Returns:
- list of tuples
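Batching a (sentence, tags) stream can be sketched as a generator; the last batch may be shorter than minibatch_size:

```python
def minibatches(data, minibatch_size):
    """Yield (sentences, tags) batches of at most minibatch_size items each."""
    x_batch, y_batch = [], []
    for x, y in data:
        if len(x_batch) == minibatch_size:
            yield x_batch, y_batch
            x_batch, y_batch = [], []
        x_batch.append(x)
        y_batch.append(y)
    if x_batch:
        # emit the final, possibly smaller, batch
        yield x_batch, y_batch
```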
BiLstmCommon.pad_sequences(sequences, pad_tok, nlevels=1)[source]¶
Args:
- sequences: a generator of lists or tuples
- pad_tok: the char to pad with
Returns:
- a list of lists where each sublist has the same length
class