wireservice / csvkit / Issues / #854
Issue created Jul 10, 2017 by Administrator (@root)

csvgrep throws UTF-8 error, but on same data csvcut and csvlook don't

Created by: tlongers

I'm hitting a UTF-8 error with csvgrep that I couldn't resolve by going through previous issues.

1. The csvkit command causing the issue

csvgrep -c 1 -m "Operación" test.csv

2. The input file

test.txt (.csv renamed to .txt for GitHub)

Running file test.csv gives the following:

test.csv: UTF-8 Unicode text

3. The output text (including the traceback)

Message: Your file is not "utf-8" encoded. Please specify the correct encoding with the -e flag. Use the -v flag to see the complete error.

Traceback:

name
Traceback (most recent call last):
  File "/usr/local/bin/csvgrep", line 11, in <module>
    sys.exit(launch_new_instance())
  File "/usr/local/lib/python2.7/site-packages/csvkit/utilities/csvgrep.py", line 71, in launch_new_instance
    utility.run()
  File "/usr/local/lib/python2.7/site-packages/csvkit/cli.py", line 114, in run
    self.main()
  File "/usr/local/lib/python2.7/site-packages/csvkit/utilities/csvgrep.py", line 65, in main
    for row in filter_reader:
  File "/usr/local/lib/python2.7/site-packages/six.py", line 558, in next
    return type(self).__next__(self)
  File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 60, in __next__
    if self.test_row(row):
  File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 71, in test_row
    result = test(value)
  File "/usr/local/lib/python2.7/site-packages/csvkit/grep.py", line 122, in <lambda>
    return lambda x: obj in x
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)

4. Versions / locale etc

csvgrep -V
csvgrep 1.0.2
python --version
Python 2.7.13
pip -V
pip 9.0.1 from /usr/local/lib/python2.7/site-packages (python 2.7)
echo $LANG
en_GB.UTF-8

5. Operating system and version

macOS Sierra 10.12.5 (16F73)

6. Remarks

The same issue does not occur with csvcut or csvlook.

csvcut -c 1 test.csv | csvlook

Produces this output:

| name      |
| --------- |
| Operación |
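
A possible reading of the traceback in section 3: under Python 2, `obj in x` with a byte-string `obj` and a unicode `x` makes the interpreter decode `obj` with the ascii codec. In "Operación" encoded as UTF-8, the byte 0xc3 sits at position 7, which matches the error message, so the failing value looks like the `-m` pattern rather than the file contents. The sketch below reproduces that decode step explicitly in Python 3. It assumes the pattern reaches csvgrep as UTF-8 bytes and that the cell is already decoded text; `decode_pattern` is a hypothetical helper for illustration, not part of csvkit.

```python
# -*- coding: utf-8 -*-
# Illustration only: mimics the implicit ascii decode that Python 2 performs
# when a byte string is compared against unicode text.

# Assumption: this is how the -m argument arrives from the shell (UTF-8 bytes).
pattern_bytes = "Operación".encode("utf-8")


def decode_pattern(raw, encoding):
    """Hypothetical helper: turn the raw command-line bytes into text."""
    return raw.decode(encoding)


# Decoding with ascii fails on the first byte of the two-byte sequence for 'ó'.
error = None
try:
    decode_pattern(pattern_bytes, "ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Decoding with the encoding the terminal actually used lets the match work.
pattern_text = decode_pattern(pattern_bytes, "utf-8")
cell = "Operación"  # a cell value as the CSV reader would yield it (text)
print(pattern_text in cell)
```

If this reading is right, it would also explain why csvcut and csvlook are unaffected: neither compares a command-line pattern against cell values, so no bytes-versus-text comparison takes place.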