Data Preparation
Time Series Processing
This endpoint accepts an uploaded file and applies time series processing to its data. Users can configure datetime handling, resampling, time-feature extraction, lag features, rolling-window aggregations, and a forecast-horizon column.
Endpoint: POST /timeseries
Request Parameters
File Upload
file (required): The file to be processed. Supported formats include CSV, Excel, etc.
Form Parameters
datetime_column (optional): The column containing datetime values. If not provided, the API will attempt to auto-detect datetime columns.
value_columns (optional): A list of columns containing numeric values to be processed.
resample_frequency (optional): The frequency for resampling (e.g., D for daily, H for hourly). If provided, resampling is applied.
resample_method (optional, default: mean): The method for resampling. Supported values:
- mean: Resample using the mean.
- sum: Resample using the sum.
- ffill: Forward fill missing values.
- none: No resampling.
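To make the resampling options concrete, here is a minimal pandas sketch of how resample_frequency and resample_method might be applied. The helper name resample_frame, the sample column names, and the choice to aggregate only numeric columns are assumptions for illustration; the API's actual implementation is not documented here.

```python
import pandas as pd


def resample_frame(df, datetime_column, frequency, method="mean"):
    """Resample df to `frequency` (hypothetical helper, not the API's code)."""
    if method not in {"mean", "sum", "ffill", "none"}:
        raise ValueError(f"unsupported resample_method: {method!r}")
    if method == "none":
        return df.copy()  # "none" leaves the rows as they are

    indexed = df.copy()
    indexed[datetime_column] = pd.to_datetime(indexed[datetime_column])
    resampler = indexed.set_index(datetime_column).resample(frequency)

    if method == "mean":
        result = resampler.mean(numeric_only=True)  # assumption: numeric columns only
    elif method == "sum":
        result = resampler.sum(numeric_only=True)
    else:
        result = resampler.ffill()  # carry the last known value forward
    return result.reset_index()


# Illustrative data: three readings spread over two days.
sample = pd.DataFrame({
    "timestamp": ["2024-01-01 00:00", "2024-01-01 12:00", "2024-01-02 00:00"],
    "value": [1.0, 3.0, 5.0],
})
daily_mean = resample_frame(sample, "timestamp", "D", "mean")
daily_sum = resample_frame(sample, "timestamp", "D", "sum")
```

With this sample, the two readings on the first day collapse into one daily row, so the mean and sum differ only for that day.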
timezone_handling (optional, default: keep): How to handle timezones. Supported values:
- keep: Keep the original timezone.
- convert_to_utc: Convert datetime values to UTC.
- strip: Remove timezone information.
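The three timezone modes can be illustrated with a short pandas sketch. The function name handle_timezone is hypothetical, and two behaviors are assumptions rather than documented facts: timezone-naive input is returned unchanged, and strip keeps the local wall-clock time.

```python
import pandas as pd


def handle_timezone(series, mode="keep"):
    """Apply one of the documented timezone modes (hypothetical helper)."""
    if mode not in {"keep", "convert_to_utc", "strip"}:
        raise ValueError(f"unsupported timezone_handling: {mode!r}")
    # Assumption: naive timestamps carry no timezone to convert or strip.
    if mode == "keep" or series.dt.tz is None:
        return series
    if mode == "convert_to_utc":
        return series.dt.tz_convert("UTC")
    # "strip": drop the timezone but keep the local wall-clock time (assumption).
    return series.dt.tz_localize(None)


# Illustrative timestamp with a fixed +01:00 offset.
stamps = pd.Series(pd.to_datetime(["2024-03-01 09:00+01:00"]))
as_utc = handle_timezone(stamps, "convert_to_utc")
stripped = handle_timezone(stamps, "strip")
```

Note the difference between the two non-default modes: convert_to_utc shifts the clock reading to 08:00 UTC, while strip in this sketch keeps 09:00 and discards the offset.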
time_features (optional): A comma-separated list of time-based features to extract. Supported values:
- hour: Extract the hour.
- day_of_week: Extract the day of the week.
- month: Extract the month.
- quarter: Extract the quarter.
- day_of_year: Extract the day of the year.
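As a rough illustration of what time_features produces, the sketch below parses the comma-separated value and derives each feature with the pandas .dt accessor. The helper name add_time_features and the output column names (which simply reuse the feature names) are assumptions; the API may name its columns differently. Day of week follows the pandas convention, where Monday is 0.

```python
import pandas as pd

# Maps each documented feature name to a pandas datetime accessor.
FEATURE_EXTRACTORS = {
    "hour": lambda s: s.dt.hour,
    "day_of_week": lambda s: s.dt.dayofweek,  # Monday = 0 in pandas
    "month": lambda s: s.dt.month,
    "quarter": lambda s: s.dt.quarter,
    "day_of_year": lambda s: s.dt.dayofyear,
}


def add_time_features(df, datetime_column, time_features):
    """Add one column per requested feature (hypothetical helper)."""
    out = df.copy()
    stamps = pd.to_datetime(out[datetime_column])
    requested = [name.strip() for name in time_features.split(",") if name.strip()]
    for name in requested:
        if name not in FEATURE_EXTRACTORS:
            raise ValueError(f"unsupported time feature: {name!r}")
        out[name] = FEATURE_EXTRACTORS[name](stamps)  # assumed column naming
    return out


# Illustrative row: Thursday, 15 February 2024, 13:30.
frame = pd.DataFrame({"timestamp": ["2024-02-15 13:30"], "value": [42.0]})
enriched = add_time_features(
    frame, "timestamp", "hour, day_of_week, month, quarter, day_of_year"
)
```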
lag_periods (optional, default: 1,7,30): A comma-separated list of lag periods to create.
rolling_windows (optional, default: 3,7): A comma-separated list of rolling window sizes.
rolling_aggregations (optional, default: mean,std): A comma-separated list of aggregation methods for rolling windows. Supported values:
- mean: Rolling mean.
- std: Rolling standard deviation.
- sum: Rolling sum.
- max: Rolling maximum.
- min: Rolling minimum.
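The interaction between lag_periods, rolling_windows, and rolling_aggregations is easiest to see in code. The following is a sketch only: the helper name add_lag_and_rolling and the generated column names are hypothetical, and it assumes each window includes the current row, which the API may or may not do.

```python
import pandas as pd

SUPPORTED_AGGREGATIONS = {"mean", "std", "sum", "max", "min"}


def add_lag_and_rolling(df, value_column, lag_periods=(1, 7, 30),
                        rolling_windows=(3, 7),
                        rolling_aggregations=("mean", "std")):
    """Add lag and rolling-window columns for one value column (hypothetical helper)."""
    out = df.copy()
    for lag in lag_periods:
        # Row t receives the value observed `lag` rows earlier.
        out[f"{value_column}_lag_{lag}"] = out[value_column].shift(lag)
    for window in rolling_windows:
        roller = out[value_column].rolling(window=window)
        for agg in rolling_aggregations:
            if agg not in SUPPORTED_AGGREGATIONS:
                raise ValueError(f"unsupported rolling aggregation: {agg!r}")
            # One column per (window, aggregation) pair; naming is an assumption.
            out[f"{value_column}_rolling_{window}_{agg}"] = getattr(roller, agg)()
    return out


# Illustrative series of five observations.
series_frame = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 5.0]})
features = add_lag_and_rolling(
    series_frame, "value",
    lag_periods=(1,), rolling_windows=(3,), rolling_aggregations=("mean", "sum"),
)
```

Each window and aggregation pair yields its own column, so the defaults (two windows, two aggregations) would add four rolling columns per value column in this sketch. The leading rows of each new column are missing until enough history is available.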
window_size (optional, default: 30): The size of the rolling window.
forecast_horizon (optional, default: 1): The forecast horizon for creating future values.
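A request combining these parameters might be assembled as in the sketch below. Several details are assumptions not confirmed by this document: the base URL, the use of the third-party requests library, the encoding of list values as comma-separated strings (documented for some parameters but not for value_columns), and the absence of authentication. The helper names are hypothetical.

```python
def build_form_fields(**options):
    """Encode options as form fields (hypothetical helper).

    Assumption: list-like values are sent as comma-separated strings.
    """
    fields = {}
    for name, value in options.items():
        if value is None:
            continue  # omitted parameters fall back to the documented defaults
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        fields[name] = str(value)
    return fields


def post_timeseries(base_url, file_path, fields):
    """Upload file_path to POST /timeseries and return the response body as bytes."""
    import requests  # third-party; imported here so the helper above stays dependency-free

    with open(file_path, "rb") as handle:
        response = requests.post(
            f"{base_url}/timeseries",
            files={"file": handle},  # the required upload field
            data=fields,
            timeout=60,
        )
    response.raise_for_status()
    return response.content


fields = build_form_fields(
    datetime_column="timestamp",
    resample_frequency="D",
    resample_method="sum",
    time_features=["hour", "day_of_week"],
    lag_periods=[1, 7],
    forecast_horizon=3,
)
```

Calling post_timeseries with a placeholder base URL, a local file path, and these fields would then send the upload; the returned bytes are the processed file.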
Processing Logic
- Read the uploaded file: The file is converted into a Pandas DataFrame.
- Auto-detect datetime columns: If datetime_column is not provided, the API attempts to detect datetime columns.
- Handle timezones: Applies the specified timezone handling (keep, convert_to_utc, or strip).
- Resample data: If resample_frequency is provided, the data is resampled using the specified method (mean, sum, ffill, or none).
- Extract time features: If time_features is provided, extracts the specified time-based features.
- Create lag features: Creates lag features for the specified lag_periods.
- Apply rolling windows: Applies rolling windows with the specified sizes and aggregation methods.
- Create forecast horizon: Adds a forecast column shifted by the specified forecast_horizon.
- Return the modified file: The processed file is provided for download.
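Two of the steps above, datetime auto-detection and the forecast column, leave the most room for interpretation, so here is one possible reading as a pandas sketch. The detection heuristic (a parse-success threshold), the shift direction (future values pulled back onto the current row), and all helper and column names are assumptions; the API's actual behavior may differ.

```python
import pandas as pd


def detect_datetime_column(df, min_parse_ratio=0.9):
    """Return the first column that looks like datetimes, or None (hypothetical heuristic)."""
    for column in df.columns:
        if pd.api.types.is_datetime64_any_dtype(df[column]):
            return column
    for column in df.columns:
        if pd.api.types.is_numeric_dtype(df[column]):
            continue  # plain numbers are not treated as timestamps here
        parsed = pd.to_datetime(df[column], errors="coerce")
        if parsed.notna().mean() >= min_parse_ratio:
            return column
    return None


def add_forecast_column(df, value_column, forecast_horizon=1):
    """Add the value observed `forecast_horizon` rows ahead as a new column."""
    out = df.copy()
    # Assumption: a negative shift aligns each row with its future value.
    out[f"{value_column}_forecast_{forecast_horizon}"] = (
        out[value_column].shift(-forecast_horizon)
    )
    return out


# Illustrative frame with one text column, one date-like column, one numeric column.
readings = pd.DataFrame({
    "region": ["north", "south", "north"],
    "timestamp": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "value": [10.0, 20.0, 30.0],
})
detected = detect_datetime_column(readings)
with_target = add_forecast_column(readings, "value", forecast_horizon=1)
```

Under this reading, the last forecast_horizon rows have no future value available and therefore contain missing values in the forecast column.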