Use Python to analyze CSV data: count identical values in a table, tally how many times each distinct option appears in a given column, and export the result as a new table.
It appears there was an issue executing the Python code due to an internal error. However, I can guide you through the process of splitting a large CSV file into smaller files, each containing 1,000,000 rows, using Python and pandas. Here’s how you can do it:
- Import pandas: First, you need to import the pandas library. If you don’t have pandas installed, you can install it using pip:

  ```
  pip install pandas
  ```

- Read and Split the CSV File:
  - Use `pandas.read_csv` with the `chunksize` parameter set to 1,000,000. This will read the file in chunks of 1,000,000 rows at a time.
  - Iterate over these chunks, and for each chunk, save it as a new CSV file.

- Example Code:
  ```python
  import pandas as pd

  def split_csv(file_path, chunk_size=1000000):
      # Read the large CSV file with the specified chunksize
      chunk_generator = pd.read_csv(file_path, chunksize=chunk_size)
      for i, chunk in enumerate(chunk_generator):
          # Define the name for the smaller chunk files
          new_file_name = f'chunk_{i+1}.csv'
          # Save the chunk to a new file
          chunk.to_csv(new_file_name, index=False)
          print(f'Chunk {i+1} saved as {new_file_name}')

  # Replace '/path/to/your/large/file.csv' with the path to your large CSV file
  split_csv('/path/to/your/large/file.csv')
  ```
In this code:

- Replace `'/path/to/your/large/file.csv'` with the path to your large CSV file.
- The function `split_csv` reads your large CSV file in chunks and saves each chunk as a new file named `chunk_1.csv`, `chunk_2.csv`, and so on.
Remember, this process might take some time depending on the size of the CSV file and the performance of your computer.
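Coming back to the original question of counting how many times each option appears in a column and exporting the counts as a new table, here is a minimal sketch using pandas `value_counts`. The column name `category` and the output file name `category_counts.csv` are placeholder assumptions; substitute your own, and read your real file with `pd.read_csv` instead of building the example DataFrame.

```python
import pandas as pd

# Small example table; in practice, load your file with:
# df = pd.read_csv('/path/to/your/file.csv')
df = pd.DataFrame({'category': ['A', 'B', 'A', 'C', 'A', 'B']})

# Count how many times each distinct value appears in the column
counts = df['category'].value_counts().reset_index()
counts.columns = ['category', 'count']  # normalize column names across pandas versions

# Export the counts as a new table
counts.to_csv('category_counts.csv', index=False)
print(counts)
```

This writes a two-column table (the distinct value and its occurrence count), sorted from most to least frequent.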