neuralconvo.ini + missing data/train.enc #3

@johndpope

The neuralconvo.ini specifies the following files:

[strings]
# Mode : train, test, serve
mode = train
train_enc = data/train.enc
train_dec = data/train.dec
test_enc = data/test.enc
test_dec = data/test.enc

But there is no data folder in the repo, only working_dir.
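
A quick way to see which of the configured files are actually missing before starting a training run is sketched below. It assumes the four path keys sit in the [strings] section exactly as quoted above; the helper name missing_data_files and the base_dir parameter are made up for illustration and are not part of the repo.

```python
import configparser
import os

# Keys taken from the [strings] section quoted in this issue.
DATA_KEYS = ("train_enc", "train_dec", "test_enc", "test_dec")


def missing_data_files(ini_path, base_dir="."):
    """Return the configured data paths that do not exist on disk.

    Hypothetical helper: paths in the ini are resolved relative to base_dir.
    """
    parser = configparser.ConfigParser()
    # read_file (unlike read) raises if the ini itself cannot be opened.
    with open(ini_path) as ini_file:
        parser.read_file(ini_file)
    section = parser["strings"]

    missing = []
    for key in DATA_KEYS:
        path = os.path.join(base_dir, section[key])
        if not os.path.isfile(path):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_data_files("neuralconvo.ini"):
        print("missing:", path)
```

Run from the repo root, this would list every configured file that is not present, which makes it obvious whether the data folder needs to be created or the ini needs to point elsewhere.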

python3 execute.py

Mode : train

Preparing data in working_dir/
Tokenizing data in data/train.enc
Traceback (most recent call last):
  File "execute.py", line 313, in <module>
    train()
  File "execute.py", line 127, in train
    enc_train, dec_train, enc_dev, dec_dev, _, _ = data_utils.prepare_custom_data(gConfig['working_directory'],gConfig['train_enc'],gConfig['train_dec'],gConfig['test_enc'],gConfig['test_dec'],gConfig['enc_vocab_size'],gConfig['dec_vocab_size'])
  File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 137, in prepare_custom_data
    data_to_token_ids(train_enc, enc_train_ids_path, enc_vocab_path, tokenizer)
  File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 121, in data_to_token_ids
    normalize_digits)
  File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 100, in sentence_to_token_ids
    words = basic_tokenizer(sentence)
  File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 50, in basic_tokenizer
    words.extend(re.split(_WORD_SPLIT, space_separated_fragment))
  File "/usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/re.py", line 203, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: cannot use a bytes pattern on a string-like object
