python

2023-12-11

open terminal

conda create --name <name>

to activate this environment, use

conda activate <name>

to deactivate

conda deactivate 

to register it as a jupyter kernel (run this inside the activated environment, with ipykernel installed there)

python -m ipykernel install --user --name <name>

pandas: merging two csv files

in pandas, strings show up as the object dtype

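there are four options: merge, join, concat and append. merge and join suit horizontal combination (axis = 1, adding columns), while concat and append are better suited to vertical combination (axis = 0, adding rows). note that DataFrame.append was removed in pandas 2.0; use concat instead.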
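
A minimal sketch of the horizontal versus vertical distinction above. The frames and column names (id, name, score) are made up for illustration; in practice they would come from pd.read_csv on the two files.

```python
import pandas as pd

# Hypothetical data standing in for two csv files
# (normally: pd.read_csv("a.csv"), pd.read_csv("b.csv")).
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [0.5, 0.7, 0.9]})

# Horizontal combination: match rows on the shared "id" column.
# how="inner" keeps only ids present in both frames.
merged = left.merge(right, on="id", how="inner")

# Vertical combination: stack rows of frames that share columns.
more = pd.DataFrame({"id": [4], "name": ["d"]})
stacked = pd.concat([left, more], ignore_index=True)

print(merged)
print(stacked)
```

Other values of how ("left", "right", "outer") keep unmatched rows and fill the missing cells with NaN.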

sklearn

In high-dimensional spaces, data can more easily be separated linearly, and the simplicity of classifiers such as naive Bayes and linear SVMs might lead to better generalization than is achieved by other classifiers.
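
A small sketch of the separability half of that claim only (it says nothing about generalization). It assumes random Gaussian points with more features than samples, and uses a plain least-squares fit from NumPy as a stand-in for a linear classifier; none of this comes from sklearn itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fewer samples than features: 20 points in a 50-dimensional space.
n_samples, n_features = 20, 50
X = rng.normal(size=(n_samples, n_features))

# Completely random labels in {-1, +1}.
y = rng.choice([-1.0, 1.0], size=n_samples)

# Least-squares weights; with n_features > n_samples the system
# X @ w = y can be solved exactly, so a separating hyperplane exists.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Check that the sign of the linear score reproduces every label.
separated = bool(np.all(np.sign(X @ w) == y))
print(separated)
```

Even arbitrary labels become linearly separable here, which is why simple linear models are often competitive on very wide data such as bag-of-words features.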

preprocessing: encoding categorical features

sklearn.preprocessing.LabelEncoder

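LabelEncoder takes one-dimensional input, e.g. a 1d ndarray (it is meant for target labels).

OrdinalEncoder takes two-dimensional input, e.g. a DataFrame (it is meant for feature columns).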
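
A short sketch of the input-shape difference between the two encoders. The label values and the color/size columns are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# LabelEncoder: 1d input, typically the target column.
labels = np.array(["cat", "dog", "cat", "bird"])
le = LabelEncoder()
y = le.fit_transform(labels)  # classes are sorted: bird=0, cat=1, dog=2

# OrdinalEncoder: 2d input, one integer code per category per column.
features = pd.DataFrame(
    {"color": ["red", "blue", "red"], "size": ["S", "M", "L"]}
)
oe = OrdinalEncoder()
X = oe.fit_transform(features)  # float array of shape (3, 2)

print(y)
print(X)
```

Both encoders assign codes in sorted order of the categories by default, so the integers carry no meaningful ranking unless the categories are passed explicitly to OrdinalEncoder.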

feature scaling

CountVectorizer includes text preprocessing, tokenization and stop-word filtering
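
A minimal sketch of those three steps in one call. The two sentences are made up, and stop_words="english" is just one possible setting (stop-word filtering is off by default).

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical two-document corpus.
docs = ["The cat sat on the mat.", "The dog chased the cat!"]

# lowercase=True handles preprocessing, the default token pattern
# handles tokenization, and stop_words="english" drops common words.
vec = CountVectorizer(lowercase=True, stop_words="english")
X = vec.fit_transform(docs)  # sparse document-term count matrix

# vocabulary_ maps each kept token to its column index.
vocab = sorted(vec.vocabulary_)

print(vocab)
print(X.toarray())
```

Punctuation disappears because the default token pattern only keeps runs of two or more word characters.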
