2017

Deep & Cross Network for Ad Click Predictions

Mingliang Wang

Cite Score

48

AI summary

This paper introduces the Deep & Cross Network (DCN) model for web-scale automatic feature learning, which efficiently captures feature interactions of bounded degrees and learns highly nonlinear interactions, achieving state-of-the-art performance on the Criteo CTR dataset.

Main Contributions

  • Introduces a novel cross network that explicitly applies feature crossing at each layer, efficiently learns predictive cross features of bounded degrees, and requires no manual feature engineering or exhaustive searching.
  • The cross network is simple yet effective. By design, the highest polynomial degree increases at each layer and is determined by layer depth. The network consists of all the cross terms of degree up to the highest, with their coefficients all different.
  • The cross network is memory efficient and easy to implement.
  • Experimental results demonstrate that, with a cross network, DCN achieves lower logloss than a DNN while using nearly an order of magnitude fewer parameters.
  • DCN outperforms state-of-the-art algorithms on both sparse and dense datasets, in terms of both model accuracy and memory usage.
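
The cross layer described in the contributions above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the commonly cited DCN cross-layer form x_{l+1} = x_0 (x_l · w_l) + b_l + x_l, and the function names (`cross_layer`, `cross_network`) and the toy values are hypothetical, chosen only for this example.

```python
def cross_layer(x0, xl, w, b):
    """One cross layer (assumed form): x_{l+1} = x0 * (xl . w) + b + xl."""
    # Compute the scalar xl . w first, so the d x d outer product
    # x0 xl^T is never materialized; cost stays linear in d.
    s = sum(xi * wi for xi, wi in zip(xl, w))
    return [x0_i * s + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Stack cross layers; every layer multiplies by x0 once more,
    so the highest polynomial degree grows by one per layer."""
    x = x0
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


# Toy input with d = 2 and two cross layers; all values are illustrative.
x0 = [1.0, 2.0]
weights = [[0.5, -1.0], [1.0, 0.0]]
biases = [[0.0, 0.0], [0.1, 0.1]]
out = cross_network(x0, weights, biases)
```

Each layer stores only one weight vector and one bias vector of length d, which is where the memory efficiency claimed above comes from. The residual term `+ xl` carries the lower-degree terms forward, so the output contains cross terms of every degree up to the depth plus one.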

Abstract

Feature engineering has been the key to the success of many prediction models. However, the process is nontrivial and often requires manual feature engineering or exhaustive searching. DNNs are able to automatically learn feature interactions; however, they generate all the interactions implicitly, and are not necessarily efficient in learning all types of cross features. In this paper, we propose the Deep & Cross Network (DCN), which keeps the benefits of a DNN model and, beyond that, introduces a novel cross network that is more efficient in learning certain bounded-degree feature interactions. In particular, DCN explicitly applies feature crossing at each layer, requires no manual feature engineering, and adds negligible extra complexity to the DNN model. Our experimental results demonstrate its superiority over state-of-the-art algorithms on a CTR prediction dataset and a dense classification dataset, in terms of both model accuracy and memory usage.
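
To make the "negligible extra complexity" claim concrete, the sketch below compares parameter counts. It rests on assumptions not stated on this page: each cross layer holds one weight vector and one bias vector of size d, and the deep part is a stack of fully connected layers of equal width m. The function names and the example sizes are hypothetical.

```python
def cross_params(d, num_cross_layers):
    """Parameters in the cross network (assumed layout):
    one weight vector and one bias vector of size d per layer."""
    return d * num_cross_layers * 2


def deep_params(d, m, num_deep_layers):
    """Parameters in a fully connected stack with equal layer width m.
    First layer: d*m weights + m biases; each later layer: m*m + m."""
    return d * m + m + (m * m + m) * (num_deep_layers - 1)


# Illustrative sizes only: input dimension 1000, 6 cross layers,
# 2 deep layers of width 1024.
cross_count = cross_params(1000, 6)
deep_count = deep_params(1000, 1024, 2)
ratio = cross_count / deep_count
```

Under these assumed sizes the cross network adds 12,000 parameters against roughly two million in the deep part, so its cost grows linearly in d while the deep part grows with d·m and m².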
