
2023.05.31 Pandas Error

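When reading a CSV with pandas, a row that has more fields than the rest raises an error, because pandas cannot tell what the extra columns are called.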

ERROR

Traceback (most recent call last):
  File "C:\Users\s-matsui\Desktop\_kabu\thread_board_kabucom2.py", line 1469, in <module>
    main()
  File "C:\Users\s-matsui\Desktop\_kabu\thread_board_kabucom2.py", line 1465, in main
    schedule.run_pending()
  File "c:\py\Lib\site-packages\schedule\__init__.py", line 822, in run_pending
    default_scheduler.run_pending()
  File "c:\py\Lib\site-packages\schedule\__init__.py", line 100, in run_pending
    self._run_job(job)
  File "c:\py\Lib\site-packages\schedule\__init__.py", line 172, in _run_job
    ret = job.run()
          ^^^^^^^^^
  File "c:\py\Lib\site-packages\schedule\__init__.py", line 693, in run
    ret = self.job_func()
          ^^^^^^^^^^^^^^^
  File "C:\Users\s-matsui\Desktop\_kabu\thread_board_kabucom2.py", line 1012, in sell_job_all
    delete_old_log(remaine_line)
  File "C:\Users\s-matsui\Desktop\_kabu\thread_board_kabucom2.py", line 1213, in delete_old_log
    df = pd.read_csv(path)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\py\Lib\site-packages\pandas\io\parsers\readers.py", line 912, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\py\Lib\site-packages\pandas\io\parsers\readers.py", line 583, in _read
    return parser.read(nrows)
           ^^^^^^^^^^^^^^^^^^
  File "c:\py\Lib\site-packages\pandas\io\parsers\readers.py", line 1704, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\py\Lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 234, in read
    chunks = self._reader.read_low_memory(nrows)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas\_libs\parsers.pyx", line 812, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas\_libs\parsers.pyx", line 873, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 848, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 859, in pandas._libs.parsers.TextReader._check_tokenize_status
  File "pandas\_libs\parsers.pyx", line 2025, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 21 fields in line 155, saw 41
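
The last line of the traceback gives the expected field count, the line number, and the field count actually seen. To list every such row before handing the file to pandas, something like the following could be used. This is only a sketch using the standard csv module: the function name find_ragged_rows and the inline sample text are hypothetical, and it assumes the first row carries the expected field count.

```python
import csv
import io


def find_ragged_rows(text):
    """Return (line_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(io.StringIO(text))
    expected = None
    ragged = []
    for row in reader:
        if not row:
            # Blank line: ignore it rather than report it.
            continue
        if expected is None:
            # The first non-blank row defines the expected width.
            expected = len(row)
        elif len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return ragged


result = find_ragged_rows("a,a,a\nb,b,b\nc,c,c,c,c\n")
print(result)  # [(3, 5)]
```

For a real file, the same loop can read from open(path, newline="") instead of io.StringIO.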

Workaround

Reading the following data as-is with read_csv raises the error.

a,a,a
b,b,b
c,c,c,c,c
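
The failure can be reproduced without a file by feeding the same three lines through io.StringIO. This is a minimal sketch: the helper name try_read is hypothetical, and the exact wording of the message may differ between pandas versions.

```python
import io

import pandas as pd

SAMPLE = "a,a,a\nb,b,b\nc,c,c,c,c\n"


def try_read(text):
    """Return the ParserError message, or None if read_csv succeeds."""
    try:
        pd.read_csv(io.StringIO(text))
    except pd.errors.ParserError as exc:
        return str(exc)
    return None


message = try_read(SAMPLE)
print(message)
```

The first line is taken as a 3-column header, so the 5-field third line is what triggers the error.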

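Passing column names explicitly, as follows, lets pandas read it.
The stop value of range should be the maximum number of columns + 1 (here 5 + 1 = 6), so that the names run from 1 to 5.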

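 import pandas as pd
 col = range(1, 6)  # names 1..5, one per field of the widest row
 df = pd.read_csv(path, names=col)  # the first line is now read as data, not as a header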

   1  2  3    4    5
0  a  a  a  NaN  NaN
1  b  b  b  NaN  NaN
2  c  c  c    c    c
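
Hard-coding the 6 only works while the widest row has 5 fields. One way to avoid that is to count the fields first and build the range from the result. This is a sketch under assumptions: read_ragged_csv is a hypothetical helper, the file is assumed to be non-empty and small enough to scan twice, and a temporary file stands in for the real log.

```python
import csv
import os
import tempfile

import pandas as pd


def read_ragged_csv(path):
    """Read a CSV whose rows have differing numbers of fields."""
    # First pass: find the widest row with the standard csv module.
    with open(path, newline="") as f:
        widest = max(len(row) for row in csv.reader(f))
    # Second pass: name the columns 1..widest so that no row is too wide.
    return pd.read_csv(path, names=range(1, widest + 1))


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", newline="") as f:
        f.write("a,a,a\nb,b,b\nc,c,c,c,c\n")
    df = read_ragged_csv(sample_path)

print(df)
```

Shorter rows are padded with NaN, as in the output above.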