No statistical algorithm can be the master of all machine learning application domains. That’s because the domain knowledge encoded in that algorithm is specific to the analytical challenge for which it was constructed. If you try to apply that same algorithm to a data source that differs in some way, large or small, from the original domain’s training data, its predictive power may fall flat.
That said, a new application domain may have so much in common with prior applications that data scientists can’t be blamed for trying to reuse hard-won knowledge from prior models. This is a well-established but fast-evolving frontier of data science known as “transfer learning,” which also goes by other names such as knowledge transfer, inductive transfer, and meta learning.
Transfer learning refers to the reuse of some or all of the training data, feature representations, neural-node layering, weights, training method, loss function, learning rate, and other properties of a prior model.
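To make the idea of reusing layers and weights concrete, here is a minimal sketch in Python with NumPy. Everything in it is an illustrative assumption rather than a reference implementation: the data is synthetic, the tiny two-layer network is hand-rolled, and names such as `train` and `freeze_hidden` are hypothetical. A prior model is trained on a data-rich source task, then its hidden layer is kept fixed while only a fresh output layer is fit to a small target data set.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, w2):
    H = np.tanh(X @ W1)      # hidden-layer feature representation
    p = sigmoid(H @ w2)      # predicted probability of class 1
    return H, p

def log_loss(p, y):
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def train(X, y, W1, w2, lr=0.1, steps=300, freeze_hidden=False):
    """Full-batch gradient descent; optionally keep the hidden layer fixed."""
    W1, w2 = W1.copy(), w2.copy()
    for _ in range(steps):
        H, p = forward(X, W1, w2)
        err = (p - y) / len(y)               # gradient of the loss w.r.t. the output logit
        grad_w2 = H.T @ err
        if not freeze_hidden:
            grad_pre = np.outer(err, w2) * (1.0 - H ** 2)  # backprop through tanh
            W1 -= lr * (X.T @ grad_pre)
        w2 -= lr * grad_w2
    return W1, w2

# Synthetic source task (assumed): plenty of labeled examples.
X_src = rng.normal(size=(500, 4))
y_src = (X_src @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(float)

# Synthetic target task (assumed): a related rule, but only 30 labeled examples.
X_tgt = rng.normal(size=(30, 4))
y_tgt = (X_tgt @ np.array([1.0, -0.8, 0.7, 0.1]) > 0).astype(float)

# 1. Train the prior model on the source task.
W1_src, w2_src = train(X_src, y_src,
                       W1=rng.normal(scale=0.5, size=(4, 8)),
                       w2=np.zeros(8))

# 2. Reuse its hidden layer unchanged and fit only a fresh output layer.
loss_before = log_loss(forward(X_tgt, W1_src, np.zeros(8))[1], y_tgt)
W1_tgt, w2_tgt = train(X_tgt, y_tgt, W1=W1_src, w2=np.zeros(8),
                       freeze_hidden=True)
loss_after = log_loss(forward(X_tgt, W1_tgt, w2_tgt)[1], y_tgt)

print("hidden layer reused unchanged:", np.array_equal(W1_tgt, W1_src))
print("target loss before / after:", loss_before, loss_after)
```

Freezing the hidden layer is only one point on a spectrum. Setting `freeze_hidden=False` in the second step would instead fine-tune all of the reused weights, which tends to make sense when more target data is available.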
Transfer learning is a supplement to, not a replacement for, other learning techniques that form the backbone of most data science practices. Typically, a data scientist relies on transfer learning to tap into statistical knowledge that was gained on prior projects through supervised, semi-supervised, unsupervised, or reinforcement learning.
For data scientists, there are several practical uses of transfer learning.
Modeling productivity acceleration
If data scientists can reuse prior work without the need to revise it extensively, transfer-learning techniques can greatly boost their productivity and accelerate time to insight on new modeling projects. In fact, many projects in machine learning and deep learning address solution domains for which there is ample prior work that can be reused to kick-start development and training of fresh neural networks.
Transfer learning is also useful when there are close parallels or affinities between the source and target domains. For example, a natural-language processing algorithm that was built to classify English-language technical documents in one scientific discipline should, in theory, be readily adaptable to classifying Spanish-language documents in a related field. Likewise, deep learning knowledge that was gained from training a robot to navigate through a maze may also be partially applicable to helping it learn to make its way through a dynamic obstacle course.
If a new application domain lacks sufficient high-quality labeled training data, transfer learning can help data scientists craft machine learning models that leverage relevant training data from prior modeling projects. As noted in this excellent research paper, transfer learning is an essential capability for machine learning projects in which prior training data can easily become outdated. This problem of training-data obsolescence often arises in dynamic problem domains, such as gauging social sentiment or tracking patterns in sensor data.
An example, cited in the paper, is the difficulty of training the machine-learning models that drive Wi-Fi indoor localization, considering that the key data behind these models, signal strength, may vary widely across the time periods and devices used to collect it. Transfer learning is also critical to the success of IoT deep learning applications, which produce complex machine-generated information of such staggering volume, velocity, and variety that one could never find enough human experts to label enough of it to kick-start training of new models.
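When fresh labeled data is scarce, one simple way to borrow prior training data is to pool it with the new examples and give the older examples less weight. The sketch below illustrates that idea under loud assumptions: the data is synthetic, the model is a plain logistic regression fit by gradient descent, and `OLD_WEIGHT` and `fit_weighted_logistic` are hypothetical names chosen for illustration. It is not the method of the paper mentioned above.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_log_loss(w, X, y, sample_weight):
    """Log loss in which each example counts in proportion to its weight."""
    p = sigmoid(X @ w)
    eps = 1e-12
    per_example = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return float(np.sum(sample_weight * per_example) / np.sum(sample_weight))

def fit_weighted_logistic(X, y, sample_weight, lr=0.1, steps=500):
    """Full-batch gradient descent on the weighted log loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (sample_weight * (p - y)) / np.sum(sample_weight)
        w -= lr * grad
    return w

# Older, plentiful data from a prior project (synthetic stand-in).
X_old = rng.normal(size=(200, 3))
y_old = (X_old @ np.array([1.0, 0.5, -0.5]) > 0).astype(float)

# Fresh data from the new domain: the rule has drifted, and labels are scarce.
X_new = rng.normal(size=(15, 3))
y_new = (X_new @ np.array([1.0, 0.2, -0.9]) > 0).astype(float)

# Pool both sets, trusting each old example less than each new one.
OLD_WEIGHT = 0.3  # hypothetical value; in practice this would need tuning
X_all = np.vstack([X_old, X_new])
y_all = np.concatenate([y_old, y_new])
sample_weight = np.concatenate([np.full(len(y_old), OLD_WEIGHT),
                                np.ones(len(y_new))])

w_pooled = fit_weighted_logistic(X_all, y_all, sample_weight)
loss_start = weighted_log_loss(np.zeros(3), X_all, y_all, sample_weight)
loss_end = weighted_log_loss(w_pooled, X_all, y_all, sample_weight)
print("weighted loss at start / end:", loss_start, loss_end)
```

How much to discount the older data is a judgment call. The more the domain has drifted, the lower the weight should be, and a weight of zero reduces to training on the new examples alone.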
If the underlying conditions of the phenomenon modeled have radically changed, thereby rendering prior training data sets or feature models inapplicable, transfer learning can help data scientists leverage useful subsets of training data and feature models from related domains. As discussed in this recent Harvard Business Review article, the data scientists who got the 2016 U.S. presidential election dead wrong could have benefited from statistical knowledge gained in postmortem studies of failed predictions from the U.K. Brexit fiasco.
Transfer learning can help data scientists mitigate the risks of machine-learning-driven predictions in any problem domain susceptible to highly improbable events. For example, cross-fertilization of statistical knowledge from meteorological models may be useful in predicting “perfect storms” of congestion in traffic management. Likewise, historical data on “black swans” in economics, such as stock-market crashes and severe depressions, may be useful in predicting catastrophic developments in politics and epidemiology.
Transfer learning isn’t only a productivity tool to assist data scientists with their next modeling challenge. It also stands at the forefront of the data science community’s efforts to invent “master learning algorithms” that automatically gain and apply fresh contextual knowledge through deep neural networks and other forms of AI.
Clearly, humanity is nowhere close to fashioning such a “superintelligence,” and some people, fearing a robot apocalypse or similar dystopia, hope we never do. But it’s not far-fetched to predict that, as data scientists encode more of the world’s practical knowledge in statistical models, these AI nuggets will be composed into machine intelligence of staggering sophistication.
Transfer learning will become a membrane through which this statistical knowledge infuses everything in our world.