Feature Engineering and Similarity Interview Questions
Q: What feature engineering techniques are common in time series?
Short interview answer
I usually group them into lag features, rolling statistics, seasonal and calendar features, transformations, and external covariates.
Common feature groups
- Lags: y_(t-1), y_(t-7), y_(t-24)
- Rolling stats: mean, std, min, max, quantiles
- Differences and growth rates
- Calendar and event features: hour, day of week, month, holiday, promotion
- Expanding-window features
- Spectral or frequency-domain summaries
Typical formulas
Lag-k feature:
x_t^(lag,k) = y_(t-k)
Rolling mean over window w:
m_t = (1 / w) Σ_(i=1 to w) y_(t-i)
Rolling variance:
s_t^2 = (1 / (w-1)) Σ_(i=1 to w) (y_(t-i) - m_t)^2
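The formulas above can be sketched in pandas. This is a minimal illustration, not a definitive implementation: the function name `make_lag_rolling_features`, the default lags, and the toy series are assumptions made for the example. The series is shifted by one step before the rolling window is applied, so that m_t and s_t^2 use only y_(t-1) ... y_(t-w), as in the formulas.

```python
import pandas as pd

def make_lag_rolling_features(y: pd.Series, lags=(1, 7), window=3) -> pd.DataFrame:
    """Build lag and rolling features that only use values strictly before time t."""
    feats = pd.DataFrame(index=y.index)
    for k in lags:
        # Lag-k feature: x_t = y_(t-k)
        feats[f"lag_{k}"] = y.shift(k)
    # Shift by one so the window covers y_(t-1) ... y_(t-w) and never y_t itself.
    past = y.shift(1)
    feats[f"roll_mean_{window}"] = past.rolling(window).mean()
    # pandas uses the sample variance (divides by w - 1) by default.
    feats[f"roll_var_{window}"] = past.rolling(window).var()
    return feats

# Toy series, for illustration only.
y = pd.Series([10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0, 17.0, 19.0])
features = make_lag_rolling_features(y)
print(features)
```

The first rows contain NaN because there is not yet enough history; in practice those rows are usually dropped before training.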
Q: How do you transform time-series data?
Short interview answer
Transformations can stabilize variance, improve stationarity, reduce skew, or make optimization easier.
Important formulas
Log transform (requires y_t > 0; use log(1 + y_t) when zeros occur):
z_t = log(y_t)
Box-Cox transform (requires y_t > 0):
z_t = (y_t^λ - 1) / λ if λ ≠ 0
z_t = log(y_t) if λ = 0
Z-score scaling:
z_t = (y_t - μ) / σ
Min-max scaling:
z_t = (y_t - y_min) / (y_max - y_min)
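As a rough sketch of the Box-Cox formula, the NumPy function below applies it for a given λ. The name `box_cox` and the sample values are assumptions for illustration; the function takes λ as an input rather than estimating it.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform for a fixed lambda; input must be strictly positive."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the log transform.
        return np.log(y)
    return (y ** lam - 1.0) / lam

# Toy values spanning two orders of magnitude.
y = np.array([1.0, 10.0, 100.0])
z_log = box_cox(y, 0.0)    # same as np.log(y)
z_sqrt = box_cox(y, 0.5)   # square-root-like compression
z_near = box_cox(y, 1e-6)  # small lambda approaches the log transform
print(z_log, z_sqrt, z_near)
```

The λ = 0 branch is the limit of the general formula as λ approaches 0, which is why the log transform is treated as a special case of Box-Cox.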
Production caveat
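Fit scaling parameters (μ, σ, y_min, y_max, and the Box-Cox λ) only on the training period, then apply them unchanged to validation and test data. Fitting on the full series leaks future information into the features.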
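A minimal sketch of the caveat for z-score scaling, assuming a simple chronological split. The helper names `fit_zscore` and `apply_zscore`, the split point, and the toy values are hypothetical, and the population standard deviation is an arbitrary choice here.

```python
import numpy as np

def fit_zscore(train):
    """Estimate scaling parameters from the training period only."""
    mu = float(np.mean(train))
    sigma = float(np.std(train))  # population std; an assumption for this sketch
    return mu, sigma

def apply_zscore(values, mu, sigma):
    """Apply previously fitted parameters without re-estimating them."""
    return (np.asarray(values, dtype=float) - mu) / sigma

# Toy series with a level shift in the later (test) period.
series = np.array([2.0, 4.0, 6.0, 8.0, 50.0, 60.0])
split = 4  # chronological split: first 4 points are training
train, test = series[:split], series[split:]

mu, sigma = fit_zscore(train)
train_z = apply_zscore(train, mu, sigma)
test_z = apply_zscore(test, mu, sigma)
print(mu, sigma, train_z, test_z)
```

The test values land far outside the training range after scaling. That is expected and honest: the model would see the same thing in production, whereas fitting on the full series would hide the shift.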
Q: How do you measure similarity between two time series?
Strong answer
The choice depends on whether timing shifts, and differences in scale or offset, should count as mismatch.
Key options
- Euclidean distance if the sequences are aligned and the same length.
- Dynamic Time Warping if local time shifts should be tolerated.
- Correlation if shape matters more than scale.
- Cosine similarity if direction matters more than magnitude.
Important formulas
Euclidean distance:
d(x, y) = sqrt( Σ_i (x_i - y_i)^2 )
Cosine similarity:
cos(x, y) = (x · y) / (||x|| ||y||)
Pearson correlation:
ρ(x, y) = Cov(x, y) / (σ_x σ_y)
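The three formulas can be compared on a small example. The sketch below assumes two aligned, equal-length NumPy arrays; the function names and the toy data (y is a scaled and shifted copy of x) are chosen only for illustration.

```python
import numpy as np

def euclidean(x, y):
    """Euclidean distance between two aligned, equal-length sequences."""
    return float(np.sqrt(np.sum((x - y) ** 2)))

def cosine_similarity(x, y):
    """Cosine of the angle between x and y; ignores magnitude but not offset."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def pearson(x, y):
    """Pearson correlation: cosine similarity of the mean-centered sequences."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.dot(xc, yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))

# y has the same shape as x but a different scale and offset.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 10.0

d = euclidean(x, y)
c = cosine_similarity(x, y)
r = pearson(x, y)
print(d, c, r)
```

Here the Pearson correlation is 1 because it is invariant to scale and offset, cosine similarity is below 1 because the offset changes the direction of the vector, and the Euclidean distance is large because it penalizes both.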
DTW intuition
DTW solves a dynamic-programming problem that finds the minimum-cost alignment path between two sequences, allowing local stretching and compression in time.
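A minimal sketch of that dynamic program, assuming univariate sequences, absolute difference as the local cost, and no window constraint. The function name `dtw_distance` and the toy sequences are illustrative; production code would normally use an optimized library and a warping-window constraint.

```python
def dtw_distance(a, b):
    """Unconstrained DTW cost between two 1-D sequences, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimum cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to an already-used b element
                cost[i][j - 1],      # b[j-1] is matched to an already-used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]

# b is a with its first point held for one extra step.
a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
print(dtw_distance(a, b))
```

The two sequences have different lengths, so plain Euclidean distance is not defined for them, while DTW absorbs the repeated point at no cost.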
Q: How would you handle missing values in time series?
Strong answer
I first ask whether the missingness is informative. In operations or IoT systems, missingness may itself indicate failure or downtime.
Typical methods
- Forward fill or backward fill
- Linear interpolation
- Seasonal interpolation
- Model-based imputation
- Missingness indicator features
What interviewers like to hear
- Do not impute using future information if the prediction setting is causal.
- Fit and evaluate the imputation method inside the training pipeline, not on the full dataset.
- For long missing spans, naive interpolation can invent structure that was never in the data.
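One way to combine these points is sketched below in pandas: a forward fill that only uses past values, a cap on how far a value is carried, and an explicit missingness indicator. The function name `causal_impute`, the `max_gap` parameter, and the toy series are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def causal_impute(y: pd.Series, max_gap: int = 2) -> pd.DataFrame:
    """Forward-fill short gaps using past values only, and flag what was missing."""
    out = pd.DataFrame(index=y.index)
    # Keep the missingness pattern as a feature, since it may be informative.
    out["is_missing"] = y.isna().astype(int)
    # Forward fill never looks ahead; limit stops long gaps from being papered over.
    out["y_filled"] = y.ffill(limit=max_gap)
    return out

# Toy series with a 3-step gap and a 1-step gap.
y = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0, np.nan, 7.0])
result = causal_impute(y)
print(result)
```

With `max_gap=2`, the third consecutive missing point stays NaN instead of being filled, which leaves the decision about long gaps to a later step such as a model-based imputer or dropping the span.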