N-BEATS: learning & practice

Posted on 2022-09-22 Edited on 2022-09-27 In ML

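Study until it hurts! This time, besides learning the model, I also want to pick up the coding style of the expert who wrote it orz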

Learning the model

Introduction

N-BEATS model is based on backward and forward residual links and a very deep stack of fully-connected layers. Our architecture design methodology relies on a few key principles. First, the base architecture should be simple and generic, yet expressive (deep). Second, the architecture should not rely on time-series-specific feature engineering or input scaling. These prerequisites let us explore the potential of pure DL architecture in TS forecasting. Finally, as a prerequisite to explore interpretability, the architecture should be extendable towards making its outputs human interpretable. We now discuss how those principles converge to the proposed architecture.

Right from the start I was quite moved. The authors put forward two points:

First, the base architecture should be simple and generic, yet expressive (deep).
Second, the architecture should not rely on timeseries-specific feature engineering or input scaling

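First, the base architecture should be simple and generic, yet highly expressive (this is my own loose paraphrase).

Second, the architecture should not rely on time-series-specific feature engineering or input scaling (we will see why as we read on).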

Model structure

This part may touch on residual networks (ResNet):

https://medium.com/analytics-vidhya/deep-residual-learning-for-image-recognition-resnet-94a9c71334c9

Let's keep reading for now.

Internal layers

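What does a block look like?

A block is four fully connected layers that then fork into two branches. The first branch tries to reconstruct the input window (back_horizon, the backcast in the paper), and the second tries to predict the future window (horizon, the forecast in the paper).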
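To make the fork concrete, here is a minimal NumPy sketch of a generic block under my own assumptions: random untrained weights, hypothetical names (toy_block, relu), and plain linear heads instead of the trend or seasonality bases used later in this post. It only shows the data flow: four fully connected ReLU layers, then one head that reconstructs the input window and one head that predicts the horizon.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def toy_block(x, hidden, w_back, w_fore):
    """Four FC + ReLU layers, then two linear heads (hypothetical helper)."""
    h = x
    for w in hidden:          # shared trunk of fully connected layers
        h = relu(h @ w)
    backcast = h @ w_back     # branch 1: reconstruct the input window
    forecast = h @ w_fore     # branch 2: predict the future window
    return forecast, backcast

batch, back_horizon, horizon, n_neurons = 3, 8, 4, 16
sizes = [back_horizon] + [n_neurons] * 4
hidden = [rng.normal(size=(a, b)) * 0.1 for a, b in zip(sizes[:-1], sizes[1:])]
w_back = rng.normal(size=(n_neurons, back_horizon)) * 0.1
w_fore = rng.normal(size=(n_neurons, horizon)) * 0.1

x = rng.normal(size=(batch, back_horizon))
forecast, backcast = toy_block(x, hidden, w_back, w_fore)
print(forecast.shape, backcast.shape)
```

The backcast has the same length as the input so that it can be subtracted from it, which is what the stack does further down.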

What does a stack look like?

trend

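What is trend?

The trend component of a time series represents a persistent, long-term change in the mean of the series. The trend is the slowest-moving part of a series, the part that matters at the largest time scale. In a time series of product sales, an increasing trend might come from more and more people becoming aware of the product year by year.

(Why is the paper written in such a complicated way, sob)

It says the θ vector holds the polynomial coefficients predicted by the FC network of layer l of stack s.

I haven't figured out yet why T is built the way it is o.o

No rush, let's keep reading.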
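For what it's worth, here is how I currently read the paper's trend equation, as a small NumPy sketch rather than anything authoritative. The time vector is t = [0, 1, ..., H-1] / H, the matrix T = [1, t, ..., t^p] stacks its powers, and the trend forecast is the polynomial with coefficients θ evaluated on that grid. The values of H, p and theta below are made up for illustration.

```python
import numpy as np

H, p = 5, 2                                    # horizon length and polynomial degree (toy values)
t = np.arange(H) / H                           # time grid scaled into [0, 1)
T = np.stack([t ** i for i in range(p + 1)])   # rows: 1, t, t^2 -> shape (p + 1, H)

theta = np.array([1.0, 2.0, 0.0])              # made-up coefficients, i.e. 1 + 2 * t
y_trend = theta @ T                            # polynomial evaluated on the grid, shape (H,)
print(T.shape, y_trend)
```

Because p is kept small, the output can only be a slowly varying curve over the forecast window, which is what makes the block behave like a trend.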

Seasonality

How do we build the model from scratch? (only copy and learn)

Imports first

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from sklearn.metrics import median_absolute_error
import random
import datetime

import tensorflow as tf
from tensorflow import keras
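# Note: tensorflow.python.* is a private module path, not part of the public API, so these two imports may break on other TensorFlow versions
from tensorflow.python.keras.losses import LossFunctionWrapper
from tensorflow.python.keras.utils import losses_utils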
import tensorflow_probability as tfp
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
tfd = tfp.distributions

import matplotlib.pyplot as plt
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

!pip install -q -U keras-tuner

import keras_tuner as kt
import IPython

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

Defining the loss function

https://www.kaggle.com/code/gatandubuc/forecast-with-n-beats-interpretable-model?scriptVersionId=95425582&cellId=7
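The linked cell is not copied here. Since the block docstrings below mention a QuantileLossError and an n_quantiles parameter, here is a rough NumPy sketch of the standard quantile (pinball) loss. That this matches what the linked cell computes is my assumption, not something taken from it; the function name pinball_loss and the shapes are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, quantiles):
    """Mean pinball loss (illustrative helper, not the notebook's class).

    y_true:    (batch, horizon)
    y_pred:    (n_quantiles, batch, horizon), one forecast per quantile
    quantiles: (n_quantiles,) values in (0, 1)
    """
    q = np.asarray(quantiles, dtype=float).reshape(-1, 1, 1)
    error = y_true[None, ...] - y_pred               # positive when we under-predict
    loss = np.maximum(q * error, (q - 1.0) * error)  # asymmetric penalty per quantile
    return loss.mean()

y_true = np.array([[1.0, 2.0]])
y_pred = np.array([[[0.0, 2.0]],     # forecast for the 0.1 quantile
                   [[2.0, 2.0]]])    # forecast for the 0.9 quantile
print(pinball_loss(y_true, y_pred, [0.1, 0.9]))
```

A low quantile is penalised lightly for under-predicting and heavily for over-predicting, and a high quantile the other way round, which is why the model needs one horizon output per quantile.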

Defining the stack, object-oriented style

class Stack(tf.keras.layers.Layer):
    """A stack is a series of blocks where each block produces two outputs, the horizon and the back_horizon.
    The horizon outputs are summed to form the stack output, while the residual back_horizon is passed on to the following block.
    
    Parameters
    ----------
    blocks: list of `TrendBlock`, `SeasonalityBlock` or `GenericBlock`.
        Define blocks in a stack.
    """
    def __init__(self, blocks, **kwargs):
        
        super().__init__(**kwargs)

        self._blocks = blocks

    def call(self, inputs):

        y_horizon = 0.
        for block in self._blocks:
            residual_y, y_back_horizon = block(inputs) # shape: (n_quantiles, Batch_size, horizon), (Batch_size, back_horizon)
            inputs = tf.subtract(inputs, y_back_horizon)
            y_horizon = tf.add(y_horizon, residual_y) # shape: (n_quantiles, Batch_size, horizon)

        return y_horizon, inputs

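A stack is made of many blocks, so the first thing defined here is the blocks parameter.

Then, for each block, the input (inputs) is passed to block(), which returns residual_y and y_back_horizon.

Then y_back_horizon is subtracted from inputs to get the input of the next block.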

tf.subtract(
    x,
    y,
    name=None
)
# Returns x - y element-wise.
# Note: Subtract supports broadcasting.
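As far as I know, TensorFlow's broadcasting follows the NumPy rules, so a NumPy analogue is enough to see what "supports broadcasting" means. The arrays are toy values.

```python
import numpy as np

x = np.array([[10.0, 20.0, 30.0],
              [40.0, 50.0, 60.0]])   # shape (2, 3)
y = np.array([1.0, 2.0, 3.0])        # shape (3,), stretched across both rows

diff = np.subtract(x, y)             # same as x - y
print(diff)
```

In Stack.call both operands already have shape (Batch_size, back_horizon) according to the shape comments, so no stretching actually happens there.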

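Then y_horizon = tf.add(y_horizon, residual_y) accumulates every block's output into a single y_horizon.

Finally it returns the accumulated y_horizon together with the inputs that get passed on to the next level: return y_horizon, inputs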
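Here is the same bookkeeping in a stripped-down NumPy sketch, with hand-made toy blocks standing in for the real ones. The helper make_toy_block is hypothetical and only there to make the arithmetic easy to follow.

```python
import numpy as np

def make_toy_block(fraction, horizon):
    """Toy block: explains `fraction` of its input and forecasts the mean of that part."""
    def block(x):
        backcast = fraction * x
        forecast = np.repeat(backcast.mean(axis=1, keepdims=True), horizon, axis=1)
        return forecast, backcast
    return block

def run_stack(blocks, inputs):
    y_horizon = 0.0
    for block in blocks:
        residual_y, y_back_horizon = block(inputs)
        inputs = inputs - y_back_horizon      # what is still unexplained
        y_horizon = y_horizon + residual_y    # partial forecasts add up
    return y_horizon, inputs

x = np.array([[4.0, 8.0]])
blocks = [make_toy_block(0.5, horizon=3), make_toy_block(0.5, horizon=3)]
y_horizon, residual = run_stack(blocks, x)
print(y_horizon)   # sum of the two partial forecasts
print(residual)    # input left over after both blocks
```

Each block only sees what the previous blocks failed to reconstruct, while the forecasts are simply added together.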

Remember?

To sum up: it's object-oriented (lol), and it also feels a bit like a linked list.

class N_BEATS(tf.keras.Model):
    """This class computes the N-BEATS model. This is a univariate model which can be
     interpretable or generic. Its strong advantage is its internal structure, which allows us
     to extract the trend and the seasonality of a time series. They are available from the attributes
     `seasonality` and `trend`. This is an unofficial implementation.

     `@inproceedings{
        Oreshkin2020:N-BEATS,
        title={{N-BEATS}: Neural basis expansion analysis for interpretable time series forecasting},
        author={Boris N. Oreshkin and Dmitri Carpov and Nicolas Chapados and Yoshua Bengio},
        booktitle={International Conference on Learning Representations},
        year={2020},
        url={https://openreview.net/forum?id=r1ecqn4YwB}
        }`
    
    Parameter
    ---------
    stacks: list of `Stack` layer.
        Define the stacks to use in the N-BEATS model. Each can be full of `TrendBlock`, `SeasonalityBlock` or `GenericBlock`.
    """
    def __init__(self, 
                 stacks,
                 **kwargs):
                
        super().__init__(**kwargs)

        self._stacks = stacks

    def call(self, inputs):
        self._residuals_y = tf.TensorArray(tf.float32, size=len(self._stacks)) # Store trend and seasonality curves during inference
        y_horizon = 0.
        for idx, stack in enumerate(self._stacks):
            residual_y, inputs = stack(inputs)
            self._residuals_y = self._residuals_y.write(idx, residual_y) # write() returns the updated TensorArray, so keep the result
            y_horizon = tf.add(y_horizon, residual_y)

        return y_horizon

    @property
    def seasonality(self):
        return self._residuals_y.stack()[1]

    @property
    def trend(self):
        return self._residuals_y.stack()[0]

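OK, let's check this code against the figure and see what it is doing.

A _residuals_y array stores what the figure labels as the Horizon nH periods (I don't know yet exactly what these are; they should be what trend and seasonality extract). Going by the two properties at the end of the class, entry 0 is the first stack's output, read as the trend, and entry 1 is the second stack's output, read as the seasonality, so the stacks are expected in that order.

Implementing trend and seasonality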

trend

class TrendBlock(tf.keras.layers.Layer):
    """ Trend block definition. The output layers are constrained to a polynomial function of small degree p,
    so it is possible to get an explanation from this block.
    
    Parameters
    ----------
    p_degree: integer
        Degree of the polynomial function.
    horizon: integer
        Number of future time steps to forecast.
    back_horizon: integer
        Number of past time steps to reconstruct.
    n_neurons: integer
        Number of neurons in the fully connected layers.
    n_quantiles: integer
        Number of quantiles in `QuantileLossError`.
    dropout_rate: float
        Rate of the dropout applied after each fully connected layer.
    """
    def __init__(self, 
                 horizon, 
                 back_horizon,
                 p_degree,   
                 n_neurons, 
                 n_quantiles, 
                 dropout_rate,
                 **kwargs):

        super().__init__(**kwargs)
        
        self._p_degree = tf.reshape(tf.range(p_degree + 1, dtype='float32'), shape=(-1, 1)) # Shape (-1, 1) in order to broadcast horizon to all p degrees
        self._horizon = tf.cast(horizon, dtype='float32') 
        self._back_horizon = tf.cast(back_horizon, dtype='float32')
        self._n_neurons = n_neurons 
        self._n_quantiles = n_quantiles

        self._FC_stack = [tf.keras.layers.Dense(n_neurons, 
                                            activation='relu', 
                                            kernel_initializer="glorot_uniform") for _ in range(4)]
        
        self._dropout = tf.keras.layers.Dropout(dropout_rate)
        
        self._FC_back_horizon = self.add_weight(shape=(n_neurons, p_degree + 1), 
                                           trainable=True,
                                           initializer="glorot_uniform",
                                           name='FC_back_horizon_trend')
        
        self._FC_horizon = self.add_weight(shape=(n_quantiles, n_neurons, p_degree + 1),
                                           trainable=True,
                                           initializer="glorot_uniform",
                                           name='FC_horizon_trend')

        self._horizon_coef = (tf.range(self._horizon) / self._horizon) ** self._p_degree
        self._back_horizon_coef = (tf.range(self._back_horizon) / self._back_horizon) ** self._p_degree
        
    def call(self, inputs):

        for dense in self._FC_stack:
            inputs = dense(inputs) # shape: (Batch_size, n_neurons)
            inputs = self._dropout(inputs, training=True) # We bind first layers by a dropout 
            
        theta_back_horizon = inputs @ self._FC_back_horizon # shape: (Batch_size, p_degree + 1)
        theta_horizon = inputs @ self._FC_horizon # shape: (n_quantiles, Batch_size, p_degree + 1)

        y_back_horizon = theta_back_horizon @ self._back_horizon_coef # shape: (Batch_size, back_horizon)
        y_horizon = theta_horizon @ self._horizon_coef # shape: (n_quantiles, Batch_size, horizon)
        
        return y_horizon, y_back_horizon
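A NumPy shape check of the two matrix products on the horizon branch of call, using random numbers and toy sizes of my own choosing. NumPy's @ broadcasts leading batch dimensions the same way I understand tf.matmul to, so a (Batch_size, n_neurons) input times an (n_quantiles, n_neurons, p_degree + 1) weight gives one θ per quantile.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_neurons, p_degree, horizon, n_quantiles = 4, 8, 2, 6, 3

h = rng.normal(size=(batch, n_neurons))   # stands in for the output of the FC stack
fc_horizon = rng.normal(size=(n_quantiles, n_neurons, p_degree + 1))

# Same construction as self._horizon_coef: powers of t on a (p_degree + 1, horizon) grid
powers = np.arange(p_degree + 1, dtype=float).reshape(-1, 1)
horizon_coef = (np.arange(horizon) / horizon) ** powers

theta_horizon = h @ fc_horizon            # (n_quantiles, batch, p_degree + 1)
y_horizon = theta_horizon @ horizon_coef  # (n_quantiles, batch, horizon)
print(theta_horizon.shape, y_horizon.shape)
```

The last axis of θ has p_degree + 1 entries because the powers run from 0 to p_degree inclusive.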

Seasonality

class SeasonalityBlock(tf.keras.layers.Layer):
    """Seasonality block definition. The output layers are constrained to a Fourier series.
    Each expansion coefficient then becomes a coefficient of the Fourier series. As each block and each
    stack outputs are summed, we decided to introduce Fourier order and multiple seasonality periods.
    Therefore it is possible to get an explanation from this block.
    
    Parameters
    ----------
    horizon: integer
        Number of future time steps to forecast.
    back_horizon: integer
        Number of past time steps to reconstruct.
    n_neurons: integer
        Number of neurons in the fully connected layers.
    periods: integer
        Fourier series period. The paper sets this parameter to `horizon/2`.
    back_periods: integer
        Fourier series back period. The paper sets this parameter to `back_horizon/2`.
    horizon_fourier_order: integer
        Higher values give a more complex Fourier series.
    back_horizon_fourier_order: integer
        Higher values give a more complex Fourier series.
    n_quantiles: integer
        Number of quantiles in `QuantileLossError`.
    dropout_rate: float
        Rate of the dropout applied after each fully connected layer.
    """
    def __init__(self,
                 horizon,
                 back_horizon,
                 n_neurons, 
                 periods, 
                 back_periods, 
                 horizon_fourier_order,
                 back_horizon_fourier_order,
                 n_quantiles,
                 dropout_rate,
                 **kwargs):
        
        super().__init__(**kwargs)

        self._horizon = horizon
        self._back_horizon = back_horizon
        self._periods = tf.cast(tf.reshape(periods, (1, -1)), 'float32') # Broadcast horizon on multiple periods
        self._back_periods = tf.cast(tf.reshape(back_periods, (1, -1)), 'float32')  # Broadcast back horizon on multiple periods
        self._horizon_fourier_order = tf.reshape(tf.range(horizon_fourier_order, dtype='float32'), shape=(-1, 1)) # Broadcast horizon on multiple fourier order
        self._back_horizon_fourier_order = tf.reshape(tf.range(back_horizon_fourier_order, dtype='float32'), shape=(-1, 1)) # Broadcast back horizon on multiple fourier order

        # Work out the number of neurons needed to compute seasonality coefficients
        horizon_neurons = tf.reduce_sum(2 * horizon_fourier_order)
        back_horizon_neurons = tf.reduce_sum(2 * back_horizon_fourier_order)
        
        self._FC_stack = [tf.keras.layers.Dense(n_neurons, 
                                               activation='relu', 
                                               kernel_initializer="glorot_uniform") for _ in range(4)]
        
        self._dropout = tf.keras.layers.Dropout(dropout_rate)   
        
        self._FC_back_horizon = self.add_weight(shape=(n_neurons, back_horizon_neurons), 
                                           trainable=True,
                                           initializer="glorot_uniform",
                                           name='FC_back_horizon_seasonality')

        self._FC_horizon = self.add_weight(shape=(n_quantiles, n_neurons, horizon_neurons), 
                                           trainable=True,
                                           initializer="glorot_uniform",
                                           name='FC_horizon_seasonality')
        
        # Work out cos and sin seasonality coefficients
        time_horizon = tf.range(self._horizon, dtype='float32') / self._periods
        horizon_seasonality = 2 * np.pi * self._horizon_fourier_order * time_horizon
        self._horizon_coef = tf.concat((tf.cos(horizon_seasonality), 
                                          tf.sin(horizon_seasonality)), axis=0)

        time_back_horizon = tf.range(self._back_horizon, dtype='float32') / self._back_periods
        back_horizon_seasonality = 2 * np.pi * self._back_horizon_fourier_order * time_back_horizon
        self._back_horizon_coef = tf.concat((tf.cos(back_horizon_seasonality), 
                                        tf.sin(back_horizon_seasonality)), axis=0)
        
    def call(self, inputs):

        for dense in self._FC_stack:
            inputs = dense(inputs) # shape: (Batch_size, nb_neurons)
            inputs = self._dropout(inputs, training=True) # We bind first layers by a dropout 

        theta_horizon = inputs @ self._FC_horizon # shape: (n_quantiles, Batch_size, 2 * fourier order)
        theta_back_horizon = inputs @ self._FC_back_horizon # shape: (Batch_size, 2 * fourier order)
        
        y_horizon = theta_horizon @ self._horizon_coef # shape: (n_quantiles, Batch_size, horizon)
        y_back_horizon = theta_back_horizon @ self._back_horizon_coef # shape: (Batch_size, back_horizon)
    
        return y_horizon, y_back_horizon
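The basis construction in __init__ can be replayed in NumPy to see what _horizon_coef contains. This sketch assumes a single period and toy sizes; theta is a made-up coefficient vector, not a trained one.

```python
import numpy as np

horizon, period, fourier_order = 8, 8.0, 3

orders = np.arange(fourier_order, dtype=float).reshape(-1, 1)    # (fourier_order, 1)
time = np.arange(horizon, dtype=float).reshape(1, -1) / period   # (1, horizon)
angles = 2 * np.pi * orders * time                               # (fourier_order, horizon)

# cos rows on top, sin rows below, as in tf.concat(..., axis=0)
coef = np.concatenate((np.cos(angles), np.sin(angles)), axis=0)  # (2 * fourier_order, horizon)

theta = np.zeros(2 * fourier_order)
theta[1] = 1.0                    # select only the order-1 cosine row
y = theta @ coef                  # one full cosine cycle over the horizon
print(coef.shape, y.round(3))
```

Since the orders start at 0, the first cosine row is constant and the first sine row is all zeros, so the sine coefficient of order 0 has no effect on the output.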