read_excel(io, sheetname=0, header=0, skiprows=None, skip_footer=0, index_col=None, names=None, parse_cols=None, parse_dates=False, date_parser=None, na_values=None, thousands=None, convert_float=True, has_index_names=None, converters=None, dtype=None, true_values=None, false_values=None, engine=None, squeeze=False, **kwds)
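- io : string, path object; the path to the Excel file.
- sheetname : string, int, mixed list of strings/ints, or None, default 0. To read several sheets, pass a list such as sheetname=[0,1]; sheetname=None reads every sheet. Note: an int or string returns a DataFrame, while None or a list returns a dict of DataFrames.
- header : int, list of ints, default 0. The row that holds the column names; the default 0 takes the first row, and the data is everything below the column-name row. If the data has no column names, set header=None.
- skiprows : list-like, rows to skip at the beginning; the specified rows are omitted from the data.
- skip_footer : int, default 0; the number of rows to omit, counted from the end.
- index_col : int, list of ints, default None. The column to use as the index; a column-name string can also be used.
- names : array-like, default None; the names to give the columns.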
sheet1:

       ID  NUM-1  NUM-2  NUM-3
    36901    142    168    661
    36902     78    521    602
    36903    144    600    521
    36904     95    457    468
    36905     69    596    695

sheet2:

       ID  NUM-1  NUM-2  NUM-3
    36906    190    527    691
    36907    101    403    470
    import pandas as pd

    basestation = "F://pythonBook_PyPDAM/data/test.xls"
    data = pd.read_excel(basestation)
    print(data)

Output:

          ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695
Note: an int or string returns a DataFrame, while None or a list returns a dict of DataFrames.

    data_1 = pd.read_excel(basestation, sheetname=[0,1])
    print(data_1)
    print(type(data_1))

Output: a dict of DataFrames

    OrderedDict([(0,       ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695), (1,       ID  NUM-1  NUM-2  NUM-3
    0  36906    190    527    691
    1  36907    101    403    470)])
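The difference in return type can be checked with a small sketch. It assumes a recent pandas release, where the argument is spelled `sheet_name` rather than `sheetname`, with openpyxl installed; it also rebuilds the two sample sheets in an in-memory buffer instead of reading test.xls. The variable names are illustrative only.

```python
import io

import pandas as pd

# Rebuild the two sample sheets in memory (a stand-in for test.xls).
sheet1 = pd.DataFrame({
    "ID": [36901, 36902, 36903, 36904, 36905],
    "NUM-1": [142, 78, 144, 95, 69],
    "NUM-2": [168, 521, 600, 457, 596],
    "NUM-3": [661, 602, 521, 468, 695],
})
sheet2 = pd.DataFrame({
    "ID": [36906, 36907],
    "NUM-1": [190, 101],
    "NUM-2": [527, 403],
    "NUM-3": [691, 470],
})
buffer = io.BytesIO()
with pd.ExcelWriter(buffer) as writer:
    sheet1.to_excel(writer, sheet_name="sheet1", index=False)
    sheet2.to_excel(writer, sheet_name="sheet2", index=False)
raw = buffer.getvalue()

# An int (or a sheet name) gives back a single DataFrame.
single = pd.read_excel(io.BytesIO(raw), sheet_name=0)
# A list gives back a dict keyed by the requested entries.
several = pd.read_excel(io.BytesIO(raw), sheet_name=[0, 1])
# None gives back a dict with every sheet, keyed by sheet name.
everything = pd.read_excel(io.BytesIO(raw), sheet_name=None)

print(type(single).__name__)
print(list(several))
print(list(everything))
```

Because the list and None forms return a dict, an individual sheet is then picked out by key, for example `several[1]`.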
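(3) header parameter: the row that holds the column names; the default 0 takes the first row, and the data is everything below the column-name row. If the data has no column names, set header=None. Note that the sample file does contain a column-name row, so with header=None that row appears as the first row of data.

    data = pd.read_excel(basestation, header=None)
    print(data)

Output:

           0      1      2      3
    0     ID  NUM-1  NUM-2  NUM-3
    1  36901    142    168    661
    2  36902     78    521    602
    3  36903    144    600    521
    4  36904     95    457    468
    5  36905     69    596    695

    data = pd.read_excel(basestation, header=[3])
    print(data)

Output:

       36903  144  600  521
    0  36904   95  457  468
    1  36905   69  596  695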
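The three header settings can be compared side by side. This is a minimal sketch that assumes a recent pandas with openpyxl installed and uses an in-memory copy of sheet1 rather than the file on disk; it passes `header=3` as a plain int instead of the list used above.

```python
import io

import pandas as pd

# In-memory copy of sheet1 (a stand-in for test.xls).
sheet1 = pd.DataFrame({
    "ID": [36901, 36902, 36903, 36904, 36905],
    "NUM-1": [142, 78, 144, 95, 69],
    "NUM-2": [168, 521, 600, 457, 596],
    "NUM-3": [661, 602, 521, 468, 695],
})
buffer = io.BytesIO()
sheet1.to_excel(buffer, index=False)
raw = buffer.getvalue()

# Default header=0: the first row supplies the column names.
with_names = pd.read_excel(io.BytesIO(raw))
# header=None: columns are numbered and the name row stays in the data.
no_header = pd.read_excel(io.BytesIO(raw), header=None)
# header=3: the fourth row supplies the names; rows above it are dropped.
late_header = pd.read_excel(io.BytesIO(raw), header=3)

print(with_names.shape, no_header.shape, late_header.shape)
```

The row counts differ because everything at or above the header row is excluded from the data.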
(4) skiprows parameter: omits the specified rows from the data.

    data = pd.read_excel(basestation, skiprows=[1])
    print(data)

Output:

          ID  NUM-1  NUM-2  NUM-3
    0  36902     78    521    602
    1  36903    144    600    521
    2  36904     95    457    468
    3  36905     69    596    695
(5) skip_footer parameter: omits the given number of rows, counted from the end.

    data = pd.read_excel(basestation, skip_footer=3)
    print(data)

Output:

          ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
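Skipping rows from the top and from the bottom can be reproduced as follows. The sketch assumes a recent pandas, where the footer argument is spelled `skipfooter` rather than `skip_footer`, with openpyxl installed; the workbook is again built in memory and the names are illustrative.

```python
import io

import pandas as pd

# In-memory copy of sheet1 (a stand-in for test.xls).
sheet1 = pd.DataFrame({
    "ID": [36901, 36902, 36903, 36904, 36905],
    "NUM-1": [142, 78, 144, 95, 69],
    "NUM-2": [168, 521, 600, 457, 596],
    "NUM-3": [661, 602, 521, 468, 695],
})
buffer = io.BytesIO()
sheet1.to_excel(buffer, index=False)
raw = buffer.getvalue()

# Row 0 of the sheet is the header, so skiprows=[1] drops the first data row.
skipped = pd.read_excel(io.BytesIO(raw), skiprows=[1])
# skipfooter=3 drops the last three rows of the sheet.
trimmed = pd.read_excel(io.BytesIO(raw), skipfooter=3)

print(skipped["ID"].tolist())
print(trimmed["ID"].tolist())
```

Row positions in `skiprows` count from the top of the sheet, including the header row, which is why index 1 removes ID 36901.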
(6) index_col parameter: the column to use as the index.

    data = pd.read_excel(basestation, index_col="NUM-3")
    print(data)

Output:

              ID  NUM-1  NUM-2
    NUM-3
    661    36901    142    168
    602    36902     78    521
    521    36903    144    600
    468    36904     95    457
    695    36905     69    596
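The effect on the resulting frame can be inspected directly. This sketch assumes a recent pandas with openpyxl installed and an in-memory copy of sheet1; it selects the index column by position (3, the fourth column) as a conservative choice, rather than by name.

```python
import io

import pandas as pd

# In-memory copy of sheet1 (a stand-in for test.xls).
sheet1 = pd.DataFrame({
    "ID": [36901, 36902, 36903, 36904, 36905],
    "NUM-1": [142, 78, 144, 95, 69],
    "NUM-2": [168, 521, 600, 457, 596],
    "NUM-3": [661, 602, 521, 468, 695],
})
buffer = io.BytesIO()
sheet1.to_excel(buffer, index=False)
raw = buffer.getvalue()

# Column position 3 ("NUM-3") becomes the row index and leaves the columns.
indexed = pd.read_excel(io.BytesIO(raw), index_col=3)

print(indexed.index.name)
print(list(indexed.columns))
print(indexed.loc[661, "ID"])
```

Once a column is the index, rows can be looked up by its values with `.loc`, as in the last line.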
(7) names parameter: specifies the column names.

    data = pd.read_excel(basestation, names=["a","b","c","e"])
    print(data)

Output:

           a    b    c    e
    0  36901  142  168  661
    1  36902   78  521  602
    2  36903  144  600  521
    3  36904   95  457  468
    4  36905   69  596  695
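How `names` interacts with `header` is easy to get wrong, so here is a small sketch of both combinations. It assumes a recent pandas with openpyxl installed and an in-memory copy of sheet1; the variable names are illustrative.

```python
import io

import pandas as pd

# In-memory copy of sheet1 (a stand-in for test.xls).
sheet1 = pd.DataFrame({
    "ID": [36901, 36902, 36903, 36904, 36905],
    "NUM-1": [142, 78, 144, 95, 69],
    "NUM-2": [168, 521, 600, 457, 596],
    "NUM-3": [661, 602, 521, 468, 695],
})
buffer = io.BytesIO()
sheet1.to_excel(buffer, index=False)
raw = buffer.getvalue()

new_names = ["a", "b", "c", "e"]
# With the default header=0 the original name row is replaced by new_names.
renamed = pd.read_excel(io.BytesIO(raw), names=new_names)
# With header=None the original name row is kept as the first data row.
renamed_raw = pd.read_excel(io.BytesIO(raw), header=None, names=new_names)

print(list(renamed.columns), len(renamed))
print(len(renamed_raw), renamed_raw.iloc[0, 0])
```

For a file that has no name row at all, the `header=None` form is the one that avoids losing the first record.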
The function for saving is pd.DataFrame.to_excel(). Note that what gets written to Excel must be a DataFrame, i.e. "Write DataFrame to an excel sheet". Its parameters are as follows:
to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)
- excel_writer : string or ExcelWriter object. File path or existing ExcelWriter; the destination path.
- sheet_name : string, default 'Sheet1'. Name of the sheet which will contain the DataFrame, i.e. which sheet of the workbook is filled.
- na_rep : string, default ''. Missing data representation; the value written in place of missing data.
- float_format : string, default None. Format string for floating point numbers.
- columns : sequence, optional. Columns to write; selects the columns to output.
- header : boolean or list of string, default True. Write out column names. If a list of strings is given, it is assumed to be aliases for the column names.
- index : boolean, default True. Write row names (index).
- index_label : string or sequence, default None. Column label for index column(s) if desired. If None is given, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
- startrow : upper left cell row to dump the data frame.
- startcol : upper left cell column to dump the data frame.
- engine : string, default None. Write engine to use; you can also set this via the options io.excel.xlsx.writer, io.excel.xls.writer, and io.excel.xlsm.writer.
- merge_cells : boolean, default True. Write MultiIndex and Hierarchical Rows as merged cells.
- encoding : string, default None. Encoding of the resulting Excel file. Only necessary for xlwt; other writers support unicode natively.
- inf_rep : string, default 'inf'. Representation for infinity (there is no native representation for infinity in Excel).
- freeze_panes : tuple of integer (length 2), default None. Specifies the one-based bottommost row and rightmost column that is to be frozen.
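As a rough illustration of `excel_writer`, `sheet_name`, `startrow` and `startcol` working together, the sketch below writes one frame to two sheets through a shared `pd.ExcelWriter` and then inspects the cells with openpyxl. It assumes a recent pandas with openpyxl installed; the sheet names "first" and "shifted" and the trimmed two-row frame are made up for the example.

```python
import io

import openpyxl
import pandas as pd

# A trimmed version of the sample data, for illustration only.
df = pd.DataFrame({"ID": [36901, 36902], "NUM-1": [142, 78]})

buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    # Passing the same writer lets several sheets share one workbook.
    df.to_excel(writer, sheet_name="first", index=False)
    # startrow/startcol are zero-based offsets of the upper left cell.
    df.to_excel(writer, sheet_name="shifted", index=False, startrow=2, startcol=1)

workbook = openpyxl.load_workbook(io.BytesIO(buffer.getvalue()))
print(workbook.sheetnames)
print(workbook["first"]["A1"].value)    # header starts in the top left corner
print(workbook["shifted"]["B3"].value)  # header moved two rows down, one column right
```

Calling `to_excel` with a plain path each time would overwrite the file, so a shared writer is the usual way to get more than one sheet into the same workbook.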
          ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695
    5  36906    165    453

Load the data:

    basestation = "F://python/data/test.xls"
    basestation_end = "F://python/data/test_end.xls"
    data = pd.read_excel(basestation)
    data.to_excel(basestation_end)

Output:

          ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695
    5  36906    165    453
    data.to_excel(basestation_end, na_rep="NULL")

Output:

          ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695
    5  36906    165    453   NULL
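One way to confirm what `na_rep` actually puts in the cell is to read the file back. The sketch assumes a recent pandas with openpyxl installed, an in-memory buffer in place of test_end.xls, and only the last two sample rows; note that "NULL" is one of the strings pandas treats as missing by default when reading, so `keep_default_na=False` is needed to see it as text.

```python
import io

import pandas as pd

# Last two rows of the sample data; NUM-3 is missing for ID 36906.
data = pd.DataFrame({
    "ID": [36905, 36906],
    "NUM-3": [695, None],
})

buffer = io.BytesIO()
data.to_excel(buffer, na_rep="NULL", index=False)
raw = buffer.getvalue()

# keep_default_na=False shows the literal text stored in the cell.
as_text = pd.read_excel(io.BytesIO(raw), keep_default_na=False)
# With default settings "NULL" is parsed back into a missing value.
as_missing = pd.read_excel(io.BytesIO(raw))

print(as_text["NUM-3"].tolist())
print(as_missing["NUM-3"].isna().tolist())
```

A placeholder outside the default missing-value list would survive a plain `read_excel` as text, which may or may not be what a downstream reader wants.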
(4) columns parameter: sequence, optional. Columns to write; selects the columns to output.

    data.to_excel(basestation_end, columns=["ID"])

Output:

          ID
    0  36901
    1  36902
    2  36903
    3  36904
    4  36905
    5  36906
(5) header parameter: boolean or list of string, default True. A list can be used to rename the columns; with header=False the header row is not written.

    data.to_excel(basestation_end, header=["a","b","c","d"])

Output:

           a    b    c    d
    0  36901  142  168  661
    1  36902   78  521  602
    2  36903  144  600  521
    3  36904   95  457  468
    4  36905   69  596  695
    5  36906  165  453

With header=False the header row is not written:

    data.to_excel(basestation_end, header=False, columns=["ID"])

Output:

    0  36901
    1  36902
    2  36903
    3  36904
    4  36905
    5  36906
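The `columns` and `header` arguments can be combined, as long as the alias list has one entry per column actually written. The following sketch assumes a recent pandas with openpyxl installed and in-memory buffers in place of test_end.xls; the frame is a trimmed, illustrative version of the sample data.

```python
import io

import pandas as pd

# Trimmed sample data, for illustration only.
data = pd.DataFrame({
    "ID": [36901, 36902],
    "NUM-1": [142, 78],
    "NUM-2": [168, 521],
})

# Write two of the three columns under the aliases "a" and "b".
aliased = io.BytesIO()
data.to_excel(aliased, columns=["ID", "NUM-1"], header=["a", "b"], index=False)

# Write a single column with no header row at all.
bare = io.BytesIO()
data.to_excel(bare, columns=["ID"], header=False, index=False)

aliased_back = pd.read_excel(io.BytesIO(aliased.getvalue()))
# The file has no name row, so it is read back with header=None.
bare_back = pd.read_excel(io.BytesIO(bare.getvalue()), header=None)

print(list(aliased_back.columns))
print(bare_back.shape)
```

Reading the header-less file with the default `header=0` would instead consume the first record as column names.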
(6) index : boolean, default True. Write row names (index).
The default is True, which writes the index; with index=False the row index (row names) is not written.
index_label : string or sequence, default None. The column label to use for the index column.

    data.to_excel(basestation_end, index=False)

Output:

       ID  NUM-1  NUM-2  NUM-3
    36901    142    168    661
    36902     78    521    602
    36903    144    600    521
    36904     95    457    468
    36905     69    596    695
    36906    165    453

    data.to_excel(basestation_end, index_label=["f"])

Output:

    f     ID  NUM-1  NUM-2  NUM-3
    0  36901    142    168    661
    1  36902     78    521    602
    2  36903    144    600    521
    3  36904     95    457    468
    4  36905     69    596    695
    5  36906    165    453
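Both index options can be verified by reading the written file back. This is a sketch under the same assumptions as before: a recent pandas with openpyxl installed, in-memory buffers instead of test_end.xls, and a trimmed illustrative frame; `index_label` is given as the plain string "f" rather than a one-element list.

```python
import io

import pandas as pd

# Trimmed sample data, for illustration only.
data = pd.DataFrame({"ID": [36901, 36902], "NUM-1": [142, 78]})

# index=False: only the data columns are written.
no_index = io.BytesIO()
data.to_excel(no_index, index=False)

# index_label="f": the index is written as a first column headed "f".
labelled = io.BytesIO()
data.to_excel(labelled, index_label="f")

no_index_back = pd.read_excel(io.BytesIO(no_index.getvalue()))
labelled_back = pd.read_excel(io.BytesIO(labelled.getvalue()))

print(list(no_index_back.columns))
print(list(labelled_back.columns))
```

Giving the index column a label makes it straightforward to restore it later, for instance by reading the file with `index_col=0`.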
