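This post was last edited by lxy287131416 on 2022-3-16 18:07
While processing data recently, I found that pandas raised an error when reading a very long file. The error was: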
Traceback (most recent call last):
File "C:/Users/Administrator/PycharmProjects/pythonProject/Delete repetition.py", line 4, in <module>
df = pd.DataFrame(pd.read_csv(inpath, header=0, sep="\s+", skiprows=[0], #error_bad_lines=False
File "D:\ProgramData\Anaconda3\envs\wind1\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "D:\ProgramData\Anaconda3\envs\wind1\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "D:\ProgramData\Anaconda3\envs\wind1\lib\site-packages\pandas\io\parsers\readers.py", line 488, in _read
return parser.read(nrows)
File "D:\ProgramData\Anaconda3\envs\wind1\lib\site-packages\pandas\io\parsers\readers.py", line 1047, in read
index, columns, col_dict = self._engine.read(nrows)
File "D:\ProgramData\Anaconda3\envs\wind1\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 223, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas\_libs\parsers.pyx", line 801, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas\_libs\parsers.pyx", line 857, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 1925, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 8 fields in line 133001, saw 9
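The message says that line 133001 has 9 fields where 8 were expected, so the parser stopped at that malformed row. I consulted this article: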
https://blog.csdn.net/weixin_32820767/article/details/82287671
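Following it, I added error_bad_lines=False to the call that reads the data:
# sep is written as a raw string so "\s+" is not treated as an invalid escape sequence
df = pd.read_csv(inpath, header=0, sep=r"\s+", skiprows=[0], error_bad_lines=False)
That solved the problem.
Some red text still appears in the console, but it does not affect execution.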
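The red text is most likely a warning: since pandas 1.3, error_bad_lines is deprecated in favor of on_bad_lines, and it was removed in pandas 2.0. Below is a minimal sketch of the same fix using the newer argument, assuming pandas 1.3 or later. The function name read_skipping_bad_lines and the sample text are made up for illustration and are not from the original script. Note that skipped rows are silently dropped, so it is worth checking how many rows are lost.

```python
import io

import pandas as pd


def read_skipping_bad_lines(source):
    """Read whitespace-separated data, dropping rows with too many fields.

    `source` may be a file path or a file-like object (hypothetical helper).
    """
    return pd.read_csv(
        source,
        header=0,             # first non-skipped line holds the column names
        sep=r"\s+",           # one or more whitespace characters
        skiprows=[0],         # skip the first physical line, as in the post
        on_bad_lines="skip",  # replaces error_bad_lines=False (pandas >= 1.3)
    )


# Hypothetical sample: line 4 has one field too many and gets skipped.
SAMPLE = "comment line\na b c\n1 2 3\n4 5 6 7\n8 9 10\n"

if __name__ == "__main__":
    df = read_skipping_bad_lines(io.StringIO(SAMPLE))
    print(df)
```

Passing on_bad_lines="warn" instead of "skip" reports each dropped row, which helps to find out why a line such as 133001 has an extra field.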