wireservice / csvkit · Issues · #93
Closed
Issue created Aug 04, 2011 by Administrator @root (Contributor)

sniffer slowness

Created by: thatmattbone

I have a somewhat large CSV (~35k lines) that takes quite a while to run csvstat over. The cause of the slowness seems to be the call to Python's csv.Sniffer().sniff(), which runs over the entire contents of the file.

My simple fix was to limit the sniff to the first 4096 bytes of the file. Maybe this should be a command-line flag? Or maybe sniffer.py should not use csv.Sniffer().sniff() to determine the dialect if the quotechar, delimiter, etc. are specified explicitly?
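A minimal sketch of the byte-limited approach described above (the `sniff_dialect` helper name and the 4096-character default are illustrative, not csvkit's actual API):

```python
import csv

def sniff_dialect(path, sample_size=4096):
    """Guess the CSV dialect from a bounded sample of the file.

    Reading only the first `sample_size` characters keeps sniffing
    cost roughly constant, instead of scaling with file size as it
    does when the whole file is passed to csv.Sniffer().sniff().
    """
    with open(path, newline="") as f:
        sample = f.read(sample_size)
    return csv.Sniffer().sniff(sample)
```

If the dialect (delimiter, quotechar, etc.) is already given on the command line, the sniff could be skipped entirely, which is the second option suggested above.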

Either way, I'm happy to make a patch.
