2007
Cite Score: 57
AI summary
This paper introduces "self-taught learning," a machine learning framework that uses unlabeled data to improve supervised classification, even when that data shares neither the class labels nor the generative distribution of the labeled data. The described approach applies sparse coding to the unlabeled data to construct higher-level features, which the authors report significantly improve performance on image, audio, and text classification tasks.
Abstract
We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.
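The two-stage idea in the abstract (learn bases from unlabeled data with sparse coding, then re-express the labeled data in terms of those bases) can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration: the function names `soft_threshold`, `sparse_codes`, and `learn_bases`, the penalty weight `beta`, the 0.5 factor on the reconstruction term, and the choice of ISTA plus a least-squares basis update are not taken from the paper and are not necessarily the authors' optimization procedure. The random arrays stand in for real unlabeled and labeled inputs.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise proximal operator of t * |z| (shrinks values toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_codes(X, B, beta, n_steps=100):
    """ISTA for min_A 0.5 * ||X - A B||_F^2 + beta * ||A||_1.

    X: (n_examples, n_dims) inputs; B: (n_bases, n_dims) bases.
    Returns A: (n_examples, n_bases) sparse activations.
    """
    # Lipschitz constant of the gradient: largest eigenvalue of B B^T.
    L = np.linalg.norm(B @ B.T, 2) + 1e-12
    A = np.zeros((X.shape[0], B.shape[0]))
    for _ in range(n_steps):
        grad = (A @ B - X) @ B.T
        A = soft_threshold(A - grad / L, beta / L)
    return A

def learn_bases(X_unlabeled, n_bases, beta, n_iters=10, seed=0):
    """Alternate between sparse coding and a least-squares basis update."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((n_bases, X_unlabeled.shape[1]))
    B /= np.linalg.norm(B, axis=1, keepdims=True)
    for _ in range(n_iters):
        A = sparse_codes(X_unlabeled, B, beta)
        # Unconstrained least-squares fit of the bases to the current codes.
        B = np.linalg.lstsq(A, X_unlabeled, rcond=None)[0]
        # Heuristic: rescale each basis to unit norm (unused bases stay zero).
        norms = np.linalg.norm(B, axis=1, keepdims=True)
        B /= np.maximum(norms, 1e-12)
    return B

# Stage 1: learn bases from (placeholder) unlabeled data.
rng = np.random.default_rng(42)
X_unlabeled = rng.standard_normal((200, 16))
bases = learn_bases(X_unlabeled, n_bases=8, beta=0.2)

# Stage 2: represent (placeholder) labeled inputs by their sparse activations.
X_labeled = rng.standard_normal((20, 16))
features = sparse_codes(X_labeled, bases, beta=0.2)
print(features.shape)  # (20, 8)
```

The rows of `features` would then replace (or be appended to) the raw inputs when training a standard supervised classifier such as an SVM; that step is omitted here to keep the sketch dependency-free.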