In classification algorithms, what is the difference between the training set and the validation set?

Reply 1: 武宗海山

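Generally speaking, the training set makes up the majority of the dataset (for example, 80%) and is used to fit the model's basic parameters. The validation set (say, 10%) is used repeatedly while the model is being trained to tune its hyperparameters, which is what people usually call "parameter tuning". Once the parameters are finalized, training stops and the test set (the remaining 10%) is used to evaluate the model's generalization performance.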
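As a rough illustration of the 80/10/10 split described above, here is a minimal sketch in plain Python. The function name `split_dataset` and its parameters are hypothetical, chosen only for this example; real projects often use a library helper instead.

```python
import random

def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the examples, then cut them into train / validation / test lists."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # shuffle so each split is a random sample
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # whatever remains becomes the test set
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # prints: 80 10 10
```

The important property is that the three lists never overlap, so the test set stays unseen until the very end.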

Reply 2: 和煦

for each epoch
    for each training data instance
        propagate error through the network
        adjust the weights
    calculate the accuracy over training data
    for each validation data instance
        calculate the accuracy over the validation data
    if the threshold validation accuracy is met
        exit training
    else
        continue training

Once you're finished training, then you run against your testing set and verify that the accuracy is sufficient.
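The loop above can be sketched in runnable form. This is only an illustration under stated assumptions: a 1-D logistic model trained by stochastic gradient descent stands in for the neural network, the toy data is made up, and names such as `make_data`, `train`, and `val_threshold` are hypothetical.

```python
import math
import random

def make_data(n, rng):
    """Toy 1-D binary data (made up): class 1 is centred at +2, class 0 at -2."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 if y == 1 else -2.0, 1.0)
        data.append((x, y))
    return data

def accuracy(w, b, data):
    """Fraction of examples whose predicted class matches the label."""
    correct = sum((1 if w * x + b > 0 else 0) == y for x, y in data)
    return correct / len(data)

def train(train_set, val_set, lr=0.1, max_epochs=100, val_threshold=0.95):
    w, b = 0.0, 0.0
    history = []
    for epoch in range(1, max_epochs + 1):
        # Only the training set is used to adjust the weights.
        for x, y in train_set:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
        train_acc = accuracy(w, b, train_set)
        # The validation set is only measured, never trained on.
        val_acc = accuracy(w, b, val_set)
        history.append((epoch, train_acc, val_acc))
        if val_acc >= val_threshold:
            break  # threshold validation accuracy met: exit training
    return w, b, history

rng = random.Random(0)
train_set, val_set, test_set = make_data(200, rng), make_data(50, rng), make_data(50, rng)
w, b, history = train(train_set, val_set)

# The test set is touched once, after training has finished.
test_acc = accuracy(w, b, test_set)
print(f"epochs run: {len(history)}, test accuracy: {test_acc:.2f}")
```

A common variation is to stop when validation accuracy has not improved for several epochs rather than when it crosses a fixed threshold.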

Training Set: this data set is used to adjust the weights on the neural network.

Validation Set: this data set is used to minimize overfitting. You're not adjusting the weights of the network with this data set, you're just verifying that any increase in accuracy over the training data set actually yields an increase in accuracy over a data set that has not been shown to the network before, or at least that the network hasn't trained on (i.e. the validation data set). If the accuracy over the training data set increases, but the accuracy over the validation data set stays the same or decreases, then you're overfitting your neural network and you should stop training.

Testing Set: this data set is used only for testing the final solution in order to confirm the actual predictive power of the network.

The validation set is used in the process of training. The testing set is not. The testing set allows you 1) to see if the training set was enough and 2) to see whether the validation set did the job of preventing overfitting.

Reply 3: shirley

Training set: A set of examples used for learning, that is, to fit the parameters [i.e., weights] of the classifier.

Validation set: A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.

Test set: A set of examples used only to assess the performance [generalization] of a fully specified classifier.
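To make the three roles concrete, here is a small sketch of tuning on the validation set and then assessing on the test set. It swaps in a 1-D k-nearest-neighbour classifier for the neural network, so the tuned setting is `k` rather than the number of hidden units. The data is made up and every name here (`make_data`, `knn_predict`, `candidates`) is hypothetical.

```python
import random

def make_data(n, rng):
    """Toy 1-D binary data (made up): class 1 is centred at +2, class 0 at -2."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append((rng.gauss(2.0 if y == 1 else -2.0, 1.0), y))
    return data

def knn_predict(train_set, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train_set, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(y for _, y in neighbours)
    return 1 if 2 * votes > k else 0

def accuracy(train_set, data, k):
    return sum(knn_predict(train_set, x, k) == y for x, y in data) / len(data)

rng = random.Random(1)
train_set, val_set, test_set = make_data(200, rng), make_data(50, rng), make_data(50, rng)

# Tune the setting on the validation set only (odd k avoids tied votes).
candidates = [1, 3, 5, 7, 9]
val_scores = {k: accuracy(train_set, val_set, k) for k in candidates}
best_k = max(candidates, key=lambda k: val_scores[k])

# The classifier is now fully specified; assess it once on the test set.
test_acc = accuracy(train_set, test_set, best_k)
print(f"best k: {best_k}, test accuracy: {test_acc:.2f}")
```

Because `best_k` was chosen to look good on the validation set, the validation score is an optimistic estimate; the test score is the one to report.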

Reply 4:

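Training set (train set): used to train the model and determine its weights.

Validation set (validation set): used to decide the network architecture and tune the model's hyperparameters.

Test set (test set): used to check the model's ability to generalize.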

How can a model be evaluated effectively?

Reply 5: 小宇

1. A set of methods to automatically detect patterns in data and use them to:
    Predict future data
    Make decisions

2. Input-output pairs:
    N is the number of training examples

3. Each training input is a D-dimensional vector
    Stored in an N × D design matrix
    Each dimension corresponds to a "feature"

Each training output can:
    Belong to a finite set of C classes: classification or pattern recognition
        Classification with C = 2 is often called "detection"
    Be a real value: regression
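For illustration only, the notation above could look like this in plain Python. All values are made up, and the variable names (`X`, `y_class`, `y_reg`) are assumptions for this sketch rather than anything from the original notes.

```python
# Design matrix: N rows (training examples), D columns (features).
X = [
    [5.1, 3.5],
    [4.9, 3.0],
    [6.2, 2.9],
]
N = len(X)       # number of training examples
D = len(X[0])    # dimension of each training input

# Classification: each output belongs to a finite set of classes, here {1, 2}, so C = 2.
y_class = [1, 1, 2]
C = len(set(y_class))

# Regression: each output is a real value.
y_reg = [0.2, 0.2, 1.3]

print(N, D, C)  # prints: 3 2 2
```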

Reply 6: 千佛山彭于晏

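So besides measuring accuracy, the test set also checks how well the model generalizes. Since we can't be sure that results on the validation set carry over to new data, that is one of the reasons a test set is set aside.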
