
FileNotFoundError: [Errno 2] No such file or directory: '/home/nicotetio/Desktop/Data_Engineering/DataPreparation/reviewDataCleaned.csv'


I'm trying to build an ML model with Docker. I created four files: docker-ml.py, docker-ml-inference.py, requirements.txt and the Dockerfile. I'm using the IMDB Dataset of 50k Movie Reviews; after cleaning it, I saved it in a new CSV file called reviewDataCleaned.csv.

Now I want to build the image by running docker build . (all my files are in the same folder), but I get the FileNotFoundError shown in the title.

Can you help me please?

    import json
    import os

    import pandas as pd
    from joblib import dump
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline
    from sklearn.metrics import confusion_matrix
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    MODEL_DIR = os.environ["MODEL_DIR"]
    MODEL_FILE = os.environ["MODEL_FILE"]
    METADATA_FILE = os.environ["METADATA_FILE"]
    MODEL_PATH = os.path.join(MODEL_DIR, MODEL_FILE)
    METADATA_PATH = os.path.join(MODEL_DIR, METADATA_FILE)

    print("Loading dataset...")
    reviewDataCleaned = pd.read_csv('reviewDataCleaned.csv')

    print("splitting the data")
    independent_var = reviewDataCleaned.cleaned_review
    dependant_var = reviewDataCleaned.sentiment
    X_train, X_test, y_train, y_test = train_test_split(
        independent_var, dependant_var, test_size=0.20, random_state=0)

    # Define TfidfVectorizer
    vectorizerTF = TfidfVectorizer()

    # Define classifier
    clf2 = LogisticRegression(solver='lbfgs')

    # Pipeline vectorizer and then classifier
    clf = Pipeline([('vectorizer', vectorizerTF), ('classifier', clf2)])

    print("fitting model")
    clf.fit(X_train, y_train)

    # computing meta data
    predictions = clf.predict(X_test)
    accuracy = accuracy_score(predictions, y_test)
    precision = precision_score(predictions, y_test, average='weighted')
    Recall = recall_score(predictions, y_test, average='weighted')
    metadata = {
        'accuracy_score': accuracy,
        'precision_score': precision,
        'recall_score': Recall
    }

    print("Serializing model to: {}".format(MODEL_PATH))
    dump(clf, MODEL_PATH)

    print("Saving metadata to: {}".format(METADATA_PATH))
    with open(METADATA_PATH, 'w') as output:
        json.dump(metadata, output)
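A relative path such as 'reviewDataCleaned.csv' is resolved against the process's current working directory, and a file on the host only exists inside the image if the Dockerfile copies it there (the Dockerfile is not shown in this question, so whether it does is unknown). As a rough diagnostic sketch, the helper below reports which files are actually visible at the point where the script looks for the CSV. The function name resolve_data_path and the DATA_DIR environment variable are hypothetical and not part of the original script.

```python
import os
from pathlib import Path


def resolve_data_path(filename, base_dir=None):
    """Return the absolute path to `filename` inside `base_dir`.

    `base_dir` defaults to the current working directory, which is where a
    relative path passed to pd.read_csv is resolved.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    candidate = (base / filename).resolve()
    if not candidate.is_file():
        # List what is visible so a missing COPY step or a wrong working
        # directory shows up directly in the error message.
        visible = sorted(p.name for p in base.iterdir()) if base.is_dir() else []
        raise FileNotFoundError(
            f"{candidate} not found; files visible in {base}: {visible}"
        )
    return candidate


if __name__ == "__main__":
    # DATA_DIR is a hypothetical environment variable (an assumption for
    # illustration); when unset, the current working directory is used.
    data_dir = os.environ.get("DATA_DIR")
    try:
        print(resolve_data_path("reviewDataCleaned.csv", data_dir))
    except FileNotFoundError as err:
        print(err)
```

If the listing printed inside the container does not include reviewDataCleaned.csv, the file was probably never copied into the image, or the script is running from a different working directory than expected.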
