Publication date: 22-08-2023
Number: CN116628432A
Authors:
XU YUANTU,
DONG FUDE,
HUANG RONGJIE,
HUA YAO,
WANG WEIJIE,
ZHANG PEIPEI,
LIANG JIANHUI,
XUE BOWEN,
GUO JINGYU,
YANG HAO,
ZHAO WEN,
ZHU DEQIANG,
CHEN BOTAO,
PAN RONGBO,
ZHONG FENFANG,
PAN QIAN,
LI BINGKUN,
CAI WEICONG,
PENG XIANGANG
Assignee:
The embodiment of the invention discloses a data cleaning method and device, an electronic device, and a storage medium. The data cleaning method comprises: inputting a plurality of pieces of to-be-cleaned data corresponding to each dimension into a pre-trained CSO-XGBoost model, a model that combines gradient boosting (XGBoost) with the crisscross optimization (CSO) algorithm; outputting, through the CSO-XGBoost model, prediction data corresponding to each piece of to-be-cleaned data based on the plurality of pieces of to-be-cleaned data corresponding to each dimension; and cleaning each piece of to-be-cleaned data based on its corresponding prediction data. The method applies the CSO-XGBoost model to the field of data cleaning: the model handles both abnormal value elimination and the data filling that follows it, so that the data are cleaned and their integrity, consistency and validity are guaranteed ...
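The final cleaning step might look roughly like the sketch below. It assumes the predictions have already been produced by some trained model (standing in here for the CSO-XGBoost model) and that a value counts as abnormal when it deviates from its prediction by more than a fixed threshold. The function name `clean_with_predictions`, the `max_abs_error` parameter, and the threshold rule itself are illustrative assumptions, not details taken from the abstract.

```python
from typing import List, Optional, Sequence


def clean_with_predictions(
    observed: Sequence[Optional[float]],
    predicted: Sequence[float],
    max_abs_error: float,
) -> List[float]:
    """Replace missing or abnormal observations with model predictions.

    Hypothetical rule (an assumption, not from the source): an observation
    is abnormal when |observed - predicted| exceeds max_abs_error.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    cleaned: List[float] = []
    for obs, pred in zip(observed, predicted):
        if obs is None or abs(obs - pred) > max_abs_error:
            # Missing or abnormal value: fill with the prediction.
            cleaned.append(pred)
        else:
            # Value agrees with the prediction: keep the original.
            cleaned.append(obs)
    return cleaned


if __name__ == "__main__":
    raw = [10.0, None, 55.0, 12.5]      # None marks a missing reading
    model_out = [10.2, 11.0, 12.0, 12.4]  # placeholder predictions
    print(clean_with_predictions(raw, model_out, max_abs_error=2.0))
```

Keeping the predictions as a plain input separates the cleaning rule from the model, so the same function works with any regressor that yields one prediction per data point.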