Predicting new and existing product sales in semiconductors using Amazon Forecast

This is a joint post by NXP SEMICONDUCTORS N.V. & AWS Machine Learning Solutions Lab (MLSL)

Machine learning (ML) is being used across a wide range of industries to extract actionable insights from data to streamline processes and improve revenue generation. In this post, we demonstrate how NXP, an industry leader in the semiconductor sector, collaborated with the AWS Machine Learning Solutions Lab (MLSL) to use ML techniques to optimize the allocation of the NXP research and development (R&D) budget to maximize their long-term return on investment (ROI).

NXP directs its R&D efforts largely to the development of new semiconductor solutions where they see significant opportunities for growth. To outpace market growth, NXP invests in research and development to extend or create leading market positions, with an emphasis on fast-growing, sizable market segments. For this engagement, they sought to generate monthly sales forecasts for new and existing products across different material groups and business lines. In this post, we demonstrate how the MLSL and NXP employed Amazon Forecast and other custom models for long-term sales predictions for various NXP products.

“We engaged with the team of scientists and experts at [the] Amazon Machine Learning Solutions Lab to build a solution for predicting new product sales and understand if and which additional features could help inform [the] decision-making process for optimizing R&D spending. Within just a few weeks, the team delivered multiple solutions and analyses across some of our business lines, material groups, and on [an] individual product level. MLSL delivered a sales forecast model, which complements our current way of manual forecasting, and helped us model the product lifecycle with novel machine learning approaches using Amazon Forecast and Amazon SageMaker. While keeping a constant collaborative workstream with our team, MLSL helped us with upskilling our professionals when it comes to scientific excellence and best practices on ML development using AWS infrastructure.”

– Bart Zeeman, Strategist and Analyst at CTO office in NXP Semiconductors.

Goals and use case

The goal of the engagement between NXP and the MLSL team is to predict the overall sales of NXP in various end markets. In general, the NXP team is interested in macro-level sales that include the sales of various business lines (BLs), which contain multiple material groups (MAGs). Furthermore, the NXP team is also interested in predicting the product lifecycle of newly introduced products. The lifecycle of a product is divided into four different phases (Introduction, Growth, Maturity, and Decline). The product lifecycle prediction enables the NXP team to identify the revenue generated by each product and to allocate R&D funding to the products generating the highest sales or to the products with the highest potential, maximizing the ROI for R&D activity. Additionally, they can predict the long-term sales on a micro level, which gives them a bottom-up look at how their revenue changes over time.

In the following sections, we present the key challenges associated with developing robust and efficient models for long-term sales forecasts. We further describe the intuition behind various modeling techniques employed to achieve the desired accuracy. We then present the evaluation of our final models, where we compare the performance of the proposed models in terms of sales prediction with the market experts at NXP. We also demonstrate the performance of our state-of-the-art point cloud-based product lifecycle prediction algorithm.


Challenges

One of the challenges we faced while using fine-grained or micro-level modeling, such as product-level models for sales prediction, was missing sales data. The missing data results from products not recording sales in every month. Similarly, for macro-level sales prediction, the length of the historical sales data was limited. Both the missing sales data and the limited length of historical sales data pose significant challenges in terms of model accuracy for long-term sales prediction into 2026. We observed during the exploratory data analysis (EDA) that as we move from micro-level sales (product level) to macro-level sales (BL level), missing values become less significant. However, the maximum length of historical sales data (140 months) still posed significant challenges in terms of model accuracy.

Modeling techniques

After EDA, we focused on forecasting at the BL and MAG levels and at the product level for one of the largest end markets (the automotive end market) for NXP. However, the solutions we developed can be extended to other end markets. Modeling at the BL, MAG, or product level has its own pros and cons in terms of model performance and data availability. The following table summarizes such pros and cons for each level. For macro-level sales prediction, we employed the Amazon Forecast AutoPredictor for our final solution. Similarly, for micro-level sales prediction, we developed a novel point cloud-based approach.

Macro sales prediction (top-down)

To predict the long-term sales values (2026) at the macro level, we tested various methods, including Amazon Forecast, GluonTS, and N-BEATS (implemented in GluonTS and PyTorch). Overall, Forecast outperformed all other methods based on a backtesting approach (described in the Evaluation metrics section later in this post) for macro-level sales prediction. We also compared the accuracy of AutoPredictor against human predictions.

We also proposed using N-BEATS due to its interpretative properties. N-BEATS is based on a very simple but powerful architecture that uses an ensemble of feedforward networks that employ residual connections with stacked residual blocks for forecasting. This architecture further encodes an inductive bias that makes the time series model capable of extracting trend and seasonality (see the following figure). These interpretations were generated using PyTorch Forecasting.

Micro sales prediction (bottom-up)

In this section, we discuss a novel method developed to predict the product lifecycle shown in the following figure while taking into account cold start products. We implemented this method using PyTorch on Amazon SageMaker Studio. First, we introduced a point cloud-based method. This method converts sales data into a point cloud, where each point represents sales data at a certain age of the product. The point cloud-based neural network model is then trained using this data to learn the parameters of the product lifecycle curve (see the following figure). In this approach, we also included additional features, including the product description as a bag of words, to tackle the cold start problem for predicting the product lifecycle curve.

Time series as point cloud-based product lifecycle prediction

We developed a novel point cloud-based approach to predict the product lifecycle and micro-level sales. We also included additional features to further improve the model accuracy for the cold start product lifecycle predictions. These features include product fabrication techniques and other categorical information related to the products. Such additional data can help the model predict sales of a new product even before the product is launched on the market (cold start). The following figure demonstrates the point cloud-based approach. The model takes the normalized sales and age of the product (number of months since the product was launched) as input. Based on these inputs, the model learns parameters during training using gradient descent. During the forecast phase, the parameters along with the features of a cold start product are used for predicting the lifecycle. The large number of missing values in the data at the product level negatively impacts nearly all of the existing time series models. This novel solution is based on the ideas of lifecycle modeling and treating time series data as point clouds to mitigate the missing values.
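
The conversion from a sales series to a point cloud can be sketched as follows. This is only an illustration under stated assumptions: missing months are represented as None, sales are normalized by the product's peak observed value (the post does not say which normalization was used), and the function name is hypothetical. The key point is that missing months are simply absent from the cloud rather than imputed.

```python
def to_point_cloud(monthly_sales):
    """Convert monthly sales (index = months since launch) into (age, normalized sales) points.

    Months with no recorded sales are given as None and produce no point.
    Normalizing by the peak observed value is an assumption made for this sketch.
    """
    observed = [(age, sales) for age, sales in enumerate(monthly_sales) if sales is not None]
    if not observed:
        return []
    peak = max(sales for _, sales in observed)
    if peak <= 0:
        raise ValueError("peak sales must be positive to normalize")
    return [(age, sales / peak) for age, sales in observed]


# Hypothetical product with gaps in months 0 and 2
points = to_point_cloud([None, 10.0, None, 20.0, 5.0])
```

A model that consumes unordered (age, sales) points does not need a value for every month, which is why this representation tolerates gaps.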

The following figure demonstrates how our point cloud-based lifecycle method addresses the missing data values and is capable of predicting the product lifecycle with very few training samples. The X-axis represents the age of the product, and the Y-axis represents the sales of the product. Orange dots represent the training samples, green dots represent the testing samples, and the blue line shows the product lifecycle predicted by the model.


To predict macro-level sales, we employed Amazon Forecast among other techniques. Similarly, for micro sales, we developed a state-of-the-art point cloud-based custom model. Forecast outperformed all other methods in terms of model performance. We used Amazon SageMaker notebook instances to create a data processing pipeline that extracted training examples from Amazon Simple Storage Service (Amazon S3). The training data was then used as input for Forecast to train a model and predict long-term sales.

Training a time series model using Amazon Forecast consists of three main steps. In the first step, we imported the historical data into Amazon S3. Second, a predictor was trained using the historical data. Finally, we deployed the trained predictor to generate the forecast. In this section, we provide a detailed explanation along with code snippets of each step.

We started by extracting the latest sales data. This step included uploading the dataset to Amazon S3 in the correct format. Amazon Forecast takes three columns as inputs: timestamp, item_id, and target_value (sales data). The timestamp column contains the time of sales, which could be formatted as hourly, daily, and so on. The item_id column contains the name of the sold items, and the target_value column contains sales values. Next, we used the path of training data located in Amazon S3, defined the time series dataset frequency (H, D, W, M, Y), defined a dataset name, and identified the attributes of the dataset (mapped the respective columns in the dataset and their data types). Next, we called the create_dataset function from the Boto3 API to create a dataset with attributes such as Domain, DatasetType, DatasetName, DataFrequency, and Schema. This function returned a JSON object that contained the Amazon Resource Name (ARN). This ARN was subsequently used in the following steps. See the following code:

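import boto3
forecast = boto3.client("forecast")  # Amazon Forecast client
dataset_path = "PATH_OF_DATASET_IN_S3"
DATASET_FREQUENCY = "M"  # Frequency of dataset (H, D, W, M, Y)
TS_DATASET_NAME = "NAME_OF_THE_DATASET"  # placeholder
TS_SCHEMA = {"Attributes": [  # placeholder schema; order must match the CSV columns
    {"AttributeName": "timestamp", "AttributeType": "timestamp"},
    {"AttributeName": "item_id", "AttributeType": "string"},
    {"AttributeName": "target_value", "AttributeType": "float"}]}
create_dataset_response = forecast.create_dataset(Domain="CUSTOM", DatasetType="TARGET_TIME_SERIES",
    DatasetName=TS_DATASET_NAME, DataFrequency=DATASET_FREQUENCY, Schema=TS_SCHEMA)
ts_dataset_arn = create_dataset_response['DatasetArn']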
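
The three-column input described above can be produced with the standard csv module before uploading the file to Amazon S3. The sketch below is illustrative only: the item names and values are made up, the column order (timestamp, item_id, target_value) and the yyyy-MM-dd timestamp format are assumptions that must match the schema and import settings actually used, and no header row is written.

```python
import csv
import io


def write_target_time_series(rows, fileobj):
    """Write (timestamp, item_id, target_value) rows as CSV without a header row."""
    writer = csv.writer(fileobj, lineterminator="\n")
    for timestamp, item_id, target_value in rows:
        writer.writerow([timestamp, item_id, target_value])


# Hypothetical monthly sales for one business line
rows = [
    ("2018-01-01", "BL_A", 120.0),
    ("2018-02-01", "BL_A", 135.5),
]
buffer = io.StringIO()
write_target_time_series(rows, buffer)
csv_text = buffer.getvalue()
```

In practice the same function can write to a file opened with open(path, "w", newline=""), which is then uploaded to the S3 location referenced by the import job.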

After the dataset was created, it was imported into Amazon Forecast using the Boto3 create_dataset_import_job function. The create_dataset_import_job function takes the job name (a string value), the ARN of the dataset from the previous step, the location of the training data in Amazon S3 from the previous step, and the timestamp format as arguments. It returns a JSON object containing the import job ARN. See the following code:


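TIMESTAMP_FORMAT = "yyyy-MM-dd"  # placeholder; must match the timestamps in the CSV
TS_IMPORT_JOB_NAME = "NAME_OF_THE_IMPORT_JOB"  # placeholder
# ts_s3_path, role_arn, and TIMEZONE are assumed to be defined earlier
ts_dataset_import_job_response = forecast.create_dataset_import_job(
    DatasetImportJobName=TS_IMPORT_JOB_NAME,
    DatasetArn=ts_dataset_arn,
    DataSource={
        "S3Config": {
            "Path": ts_s3_path,
            "RoleArn": role_arn
        }
    },
    TimestampFormat=TIMESTAMP_FORMAT,
    TimeZone=TIMEZONE)

ts_dataset_import_job_arn = ts_dataset_import_job_response['DatasetImportJobArn']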
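
Import jobs, predictors, and forecasts are created asynchronously, so each later step has to wait until the earlier resource is ready. The following is a minimal polling sketch rather than part of the original workflow: wait_until_active is a hypothetical helper, and it assumes the describe call returns a dictionary whose Status field eventually becomes ACTIVE or a value ending in FAILED.

```python
import time


def wait_until_active(describe, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll a describe callable until its Status is ACTIVE.

    describe: zero-argument callable returning a dict with a "Status" key.
    Raises RuntimeError on a *FAILED status and TimeoutError after max_polls.
    """
    for _ in range(max_polls):
        status = describe()["Status"]
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"resource creation failed with status {status}")
        sleep(poll_seconds)
    raise TimeoutError("resource did not become ACTIVE in time")


# Example with a stubbed describe call (no AWS access needed)
_statuses = iter([{"Status": "CREATE_PENDING"},
                  {"Status": "CREATE_IN_PROGRESS"},
                  {"Status": "ACTIVE"}])
final_status = wait_until_active(lambda: next(_statuses), sleep=lambda seconds: None)
```

With a real client, the callable could be lambda: forecast.describe_dataset_import_job(DatasetImportJobArn=ts_dataset_import_job_arn), and the same helper can wrap describe_auto_predictor and describe_forecast.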

The imported dataset was then used to create a dataset group using the create_dataset_group function. This function takes the domain (a string value defining the domain of the forecast), the dataset group name, and the dataset ARN as inputs:

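DATASET_GROUP_NAME = "NAME_OF_THE_DATASET_GROUP"  # placeholder
DATASET_ARNS = [ts_dataset_arn]

create_dataset_group_response = forecast.create_dataset_group(
    Domain="CUSTOM",
    DatasetGroupName=DATASET_GROUP_NAME,
    DatasetArns=DATASET_ARNS)

dataset_group_arn = create_dataset_group_response['DatasetGroupArn']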

Next, we used the dataset group to train forecasting models. Amazon Forecast offers various state-of-the-art models; any of these models can be used for training. We used AutoPredictor as our default model. The main advantage of using AutoPredictor is that it automatically generates the item-level forecast, using the optimal model from an ensemble of six state-of-the-art models based on the input dataset. The Boto3 API provides the create_auto_predictor function for training an auto prediction model. The input parameters of this function are PredictorName, ForecastHorizon, and ForecastFrequency. Users are also responsible for selecting the forecast horizon and frequency. The forecast horizon represents the window size of the future prediction, which can be expressed in hours, days, weeks, months, and so on. Similarly, the forecast frequency represents the granularity of the forecast values, such as hourly, daily, weekly, monthly, or yearly. We mainly focused on predicting monthly sales of NXP on various BLs. See the following code:


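PREDICTOR_NAME = "NAME_OF_THE_PREDICTOR"  # placeholder
FORECAST_HORIZON = 12  # placeholder: number of future periods (months here)
FORECAST_FREQUENCY = "M"  # monthly forecasts

create_auto_predictor_response = forecast.create_auto_predictor(
    PredictorName=PREDICTOR_NAME,
    ForecastHorizon=FORECAST_HORIZON,
    ForecastFrequency=FORECAST_FREQUENCY,
    DataConfig={
        'DatasetGroupArn': dataset_group_arn
    })

predictor_arn = create_auto_predictor_response['PredictorArn']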

The trained predictor was then used to generate forecast values. Forecasts were generated using the create_forecast function from the previously trained predictor. This function takes the name of the forecast and the ARN of the predictor as inputs and generates the forecast values for the horizon and frequency defined in the predictor:


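create_forecast_response = forecast.create_forecast(
    ForecastName="NAME_OF_THE_FORECAST", PredictorArn=predictor_arn)  # placeholder forecast name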
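
Once a forecast is active, its values can be read back per item through the separate forecastquery client. The sketch below only shows how the response might be flattened into rows; flatten_predictions is a hypothetical helper, the item name is made up, and the sample response is hand-written to mirror the documented shape (a Forecast.Predictions mapping from quantile name to a list of Timestamp/Value points).

```python
def flatten_predictions(response):
    """Turn a query_forecast response into (quantile, timestamp, value) rows."""
    predictions = response["Forecast"]["Predictions"]
    rows = []
    for quantile in sorted(predictions):
        for point in predictions[quantile]:
            rows.append((quantile, point["Timestamp"], point["Value"]))
    return rows


# Hand-written sample response (values are invented for illustration)
sample_response = {
    "Forecast": {
        "Predictions": {
            "p50": [{"Timestamp": "2022-01-01T00:00:00", "Value": 100.0}],
            "p10": [{"Timestamp": "2022-01-01T00:00:00", "Value": 80.0}],
        }
    }
}
rows = flatten_predictions(sample_response)

# With AWS access, the response would come from a call such as:
#   forecastquery = boto3.client("forecastquery")
#   response = forecastquery.query_forecast(
#       ForecastArn=create_forecast_response["ForecastArn"],
#       Filters={"item_id": "BL_A"})  # "BL_A" is a hypothetical item
```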

Amazon Forecast is a fully managed service that automatically generates training and test datasets and provides various accuracy metrics to evaluate the reliability of the model-generated forecast. However, to build consensus on the predicted data and compare the predicted values with human predictions, we divided our historical data into training data and validation data manually. We trained the model using the training data without exposing the model to validation data and generated the prediction for the length of validation data. The validation data was compared with the predicted values to evaluate the model performance. Validation metrics may include mean absolute percent error (MAPE) and weighted absolute percent error (WAPE), among others. We used WAPE as our accuracy metric, as discussed in the next section.
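
A manual split of this kind can be as simple as cutting the history at a date. The following sketch assumes rows of (timestamp, item_id, target_value) with ISO-style yyyy-MM-dd timestamps, which compare chronologically as plain strings; the function name and sample rows are hypothetical.

```python
def split_by_cutoff(rows, cutoff):
    """Split rows into training (before cutoff) and validation (cutoff and later).

    rows: iterable of (timestamp, item_id, target_value) with "YYYY-MM-DD" timestamps.
    cutoff: "YYYY-MM-DD" string; the first date that belongs to the validation set.
    """
    rows = list(rows)
    train = [row for row in rows if row[0] < cutoff]
    validation = [row for row in rows if row[0] >= cutoff]
    return train, validation


# Hypothetical history for one business line
history = [
    ("2018-11-01", "BL_A", 90.0),
    ("2018-12-01", "BL_A", 95.0),
    ("2019-01-01", "BL_A", 100.0),
]
train_rows, validation_rows = split_by_cutoff(history, "2019-01-01")
```

Only the training rows would be uploaded for the predictor, and the forecast horizon would be set to the length of the validation period.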

Evaluation metrics

We first verified the model performance using backtesting to validate the prediction of our forecast model for the long-term sales forecast (2026 sales). We evaluated the model performance using the WAPE. The lower the WAPE value, the better the model. The key advantage of using WAPE over other error metrics like MAPE is that WAPE weighs the individual impact of each item’s sale. Therefore, it accounts for each product’s contribution to the total sale while calculating the overall error. For example, if you make an error of 2% on a product that generates $30 million and an error of 10% on a product that generates $50,000, your MAPE will not tell the entire story. The 2% error is actually costlier than the 10% error, something you can’t tell by using MAPE. In contrast, WAPE accounts for these differences. We also predicted various percentile values for the sales to demonstrate the upper and lower bounds of the model forecast.
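
The example above can be worked through numerically. The sketch below uses the common definitions of the two metrics (MAPE as the mean of per-item percent errors, WAPE as total absolute error divided by total absolute actuals); the helper names are illustrative.

```python
def mape(actuals, forecasts):
    """Mean absolute percent error: every item counts equally."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return sum(errors) / len(errors)


def wape(actuals, forecasts):
    """Weighted absolute percent error: errors are weighted by sales volume."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / sum(abs(a) for a in actuals)


actuals = [30_000_000, 50_000]
forecasts = [30_600_000, 55_000]  # 2% too high and 10% too high

print(f"MAPE: {mape(actuals, forecasts):.4f}")  # prints MAPE: 0.0600
print(f"WAPE: {wape(actuals, forecasts):.4f}")  # prints WAPE: 0.0201
```

MAPE averages the 2% and 10% errors to 6%, while WAPE stays close to 2% because the $600,000 miss on the large product dominates the $5,000 miss on the small one.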

Macro-level sales prediction model validation

Next, we validated the model performance in terms of WAPE values. We calculated the WAPE value of a model by splitting the data into training and validation sets. For example, for the 2019 WAPE value, we trained our model using sales data between 2011–2018 and predicted sales values for the next year (2019 sales). Next, we calculated the WAPE value using the following formula:
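
WAPE is commonly defined as the sum of absolute errors divided by the sum of absolute actual values. In the notation below, which is chosen here for illustration, y_{i,t} is the actual sales value of item i in month t of the validation period and \hat{y}_{i,t} is the corresponding predicted value:

```latex
\mathrm{WAPE} = \frac{\sum_{i,t} \left| y_{i,t} - \hat{y}_{i,t} \right|}{\sum_{i,t} \left| y_{i,t} \right|}
```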

We repeated the same procedure to calculate the WAPE value for 2020 and 2021. We evaluated the WAPE for all BLs in the automotive end market for 2019, 2020, and 2021. Overall, we observed that Amazon Forecast can achieve a 0.33 WAPE value even for the year of 2020 (during the COVID-19 pandemic). In 2019 and 2021, our model achieved less than 0.1 WAPE values, demonstrating high accuracy.

Macro-level sales prediction baseline comparison

We compared the performance of the macro sales prediction models developed using Amazon Forecast to three baseline models in terms of WAPE value for 2019, 2020, and 2021 (see the following figure). Amazon Forecast either significantly outperformed the other baseline models or performed on par for all three years. These results further validate the effectiveness of our final model predictions.

Macro-level sales prediction model vs. human predictions

To further validate the confidence of our macro-level model, we next compared the performance of our model with the human-predicted sales values. At the beginning of the fourth quarter every year, market experts at NXP predict the sales value of each BL, taking into account global market trends as well as other global indicators that could potentially impact the sales of NXP products. We compared the percent error of the model prediction vs. human prediction to the actual sales values in 2019, 2020, and 2021. We trained three models using data from 2011–2018 and predicted the sales values until 2021. We next calculated the MAPE for the actual sales values. We then used the human-predicted values made by the end of 2018 (testing the model forecast from 1Y ahead to 3Y ahead). We repeated this process to predict the values in 2019 (1Y ahead forecast to 2Y ahead forecast) and 2020 (for 1Y ahead forecast). Overall, the model performed on par with the human predictors or better in some cases. These results demonstrate the effectiveness and reliability of our model.

Micro-level sales prediction and product lifecycle

The following figure depicts how the model behaves using product data while having access to very few observations for each product (namely one or two observations at the input for product lifecycle prediction). The orange dots represent the training data, the green dots represent the testing data, and the blue line represents the model-predicted product lifecycle.

The model can be fed more observations for context without the need for re-training as new sales data become available. The following figure demonstrates how the model behaves if it is given more context. Ultimately, more context leads to lower WAPE values.

In addition, we managed to incorporate additional features for each product, including fabrication techniques and other categorical information. In this regard, external features helped reduce the WAPE value in the low-context regime (see the following figure). There are two explanations for why the benefit appears mainly in that regime. First, we need to let the data speak for itself in the high-context regimes, and the additional features can interfere with this process. Second, we need better features. We used 1,000-dimensional one-hot-encoded features (bag of words). The conjecture is that better feature engineering techniques can help reduce WAPE even further.
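
A binary bag-of-words encoding of this kind can be sketched as follows. The tokenizer, the way the vocabulary is chosen (most frequent words first, capped at 1,000), and the sample descriptions are all assumptions made for illustration; the post does not describe how the actual features were built.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(descriptions, max_size=1000):
    """Keep the max_size words that appear in the most descriptions (ties sorted alphabetically)."""
    document_counts = Counter()
    for description in descriptions:
        document_counts.update(set(tokenize(description)))
    ranked = sorted(document_counts, key=lambda word: (-document_counts[word], word))
    return ranked[:max_size]


def encode(description, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in the description."""
    present = set(tokenize(description))
    return [1 if word in present else 0 for word in vocabulary]


# Hypothetical product descriptions
descriptions = ["CMOS power amplifier", "CMOS radar transceiver"]
vocabulary = build_vocabulary(descriptions)
features = encode("Radar amplifier", vocabulary)
```

A new product with no sales history can still be encoded this way, which is what makes such features usable for cold start predictions.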

Such additional data can help the model predict sales of new products even before the product is launched on the market. For example, in the following figure, we plot how much mileage we can get out of external features alone.


Conclusion

In this post, we demonstrated how the MLSL and NXP teams worked together to predict macro- and micro-level long-term sales for NXP. The NXP team will now learn how to use these sales predictions in their processes, for example as input for R&D funding decisions to enhance ROI. We used Amazon Forecast to predict the sales for business lines (macro sales), which we referred to as the top-down approach. We also proposed a novel approach using time series as a point cloud to tackle the challenges of missing values and cold start at the product level (micro level). We referred to this approach as bottom-up, where we predicted the monthly sales of each product. We further incorporated external features of each product to enhance the performance of the model for cold start.

Overall, the models developed during this engagement performed on par with human prediction. In some cases, the models performed better than human predictions in the long term. These results demonstrate the effectiveness and reliability of our models.

This solution can be applied to other forecasting problems. For further assistance in designing and developing ML solutions, please feel free to get in touch with the MLSL team.

About the authors

Souad Boutane is a data scientist at NXP-CTO, where she is transforming various data into meaningful insights to support business decisions using advanced tools and techniques.

Ben Fridolin is a data scientist at NXP-CTO, where he coordinates on accelerating AI and cloud adoption. He focuses on machine learning, deep learning, and end-to-end ML solutions.

Cornee Geenen is a project lead in the Data Portfolio of NXP supporting the organization in its digital transformation towards becoming data centric.

Bart Zeeman is a strategist with a passion for data & analytics at NXP-CTO, where he is driving for better data-driven decisions for more growth and innovation.

Ahsan Ali is an Applied Scientist at the Amazon Machine Learning Solutions Lab, where he works with customers from different domains to solve their urgent and expensive problems using state-of-the-art AI/ML techniques.

Yifu Hu is an Applied Scientist in the Amazon Machine Learning Solutions Lab, where he helps design creative ML solutions to address customers’ business problems in various industries.

Mehdi Noori is an Applied Science Manager at Amazon ML Solutions Lab, where he helps develop ML solutions for large organizations across various industries and leads the Energy vertical. He is passionate about using AI/ML to help customers achieve their Sustainability goals.

Huzefa Rangwala is a Senior Applied Science Manager at AIRE, AWS. He leads a team of scientists and engineers to enable machine learning based discovery of data assets. His research interests are in responsible AI, federated learning, and applications of ML in health care and life sciences.
