Creating the data
Use Python's zip to build a list of tuples, rec, to serve as the input data for the DataFrame.
In [3]: import pandas as pd
In [4]: import random
In [5]: num = random.sample(xrange(10000, 1000000), 5)
In [6]: num
Out[6]: [244937, 132008, 278446, 613409, 799201]
In [8]: names = "hello the cruel world en".split()
In [9]: names
Out[9]: ['hello', 'the', 'cruel', 'world', 'en']
In [10]: rec = zip(names, num)
In [15]: data = pd.DataFrame(rec, columns = [u"姓名", u"业绩"])
In [16]: data
Out[16]:
姓名 业绩
0 hello 244937
1 the 132008
2 cruel 278446
3 world 613409
4 en 799201
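The first argument of the DataFrame constructor is the data source. The second argument, columns, gives the header of the resulting table, that is, its field names. Here the two columns are 姓名 (name) and 业绩 (sales performance).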
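The construction described above can be sketched as a standalone script. This is a minimal sketch assuming Python 3 and a recent pandas, not the Python 2 session shown here; the fixed numbers are copied from the session output rather than sampled at random.

```python
import pandas as pd

names = ["hello", "the", "cruel", "world", "en"]
num = [244937, 132008, 278446, 613409, 799201]

# In Python 3, zip returns an iterator, so materialize it as a list of tuples.
rec = list(zip(names, num))

# First argument: the data source. columns: the field names of the table.
data = pd.DataFrame(rec, columns=["姓名", "业绩"])

print(data.shape)
print(list(data.columns))
```

Each tuple in rec becomes one row, and the columns list is matched to the tuple positions in order.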
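Exporting the data to CSV
To avoid encoding problems on Windows, first apply a simple workaround so that ipython-notebook handles utf8. It is specific to Python 2, since reload(sys) and sys.setdefaultencoding do not exist in Python 3: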
import sys
reload(sys)
sys.setdefaultencoding("utf8")
Now the data can be exported.
In [31]: data
Out[31]:
姓名 业绩
0 hello 244937
1 the 132008
2 cruel 278446
3 world 613409
4 en 799201
# In ipython-notebook, append a question mark to look up help; press q to leave the help view
In [32]: data.to_csv?
In [33]: data.to_csv("c:\\out.csv", index = True, header = [u"雇员", u"销售业绩"])
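This exports data to the file out.csv. The index parameter controls whether the row index is written. If header is not specified, the columns of data are used as the header row; if it is given, the strings in the list are used as the header instead. Note that the number of strings in header must equal the number of columns in data.
You can open out.csv on the C: drive with Notepad++ to check the result.
Simple data analysis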
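The effect of index and header can be checked without touching the disk. The following is a minimal sketch assuming Python 3 and a recent pandas; it writes to an in-memory buffer instead of c:\out.csv, and the two-row frame is a shortened copy of the session data.

```python
import io

import pandas as pd

data = pd.DataFrame({"姓名": ["hello", "the"], "业绩": [244937, 132008]})

# index=False leaves the row index out; header replaces the column names.
buf = io.StringIO()
data.to_csv(buf, index=False, header=["雇员", "销售业绩"])
csv_text = buf.getvalue()
print(csv_text)

# A header list whose length differs from the number of columns is rejected.
error_message = None
try:
    data.to_csv(io.StringIO(), header=["only_one"])
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

With index=True, as in the session, each line would additionally start with the row index.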
In [43]: data
Out[43]:
姓名 业绩
0 hello 244937
1 the 132008
2 cruel 278446
3 world 613409
4 en 799201
# Sort and take the top three
In [46]: Sorted = data.sort_values(u"业绩", ascending=False)
Sorted.head(3)
Out[46]:
姓名 业绩
4 en 799201
3 world 613409
2 cruel 278446
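The same top-three selection can be reproduced as a standalone script. This is a minimal sketch assuming Python 3 and a recent pandas; nlargest is shown only as an alternative that may be convenient, not as what the session used.

```python
import pandas as pd

data = pd.DataFrame(
    {
        "姓名": ["hello", "the", "cruel", "world", "en"],
        "业绩": [244937, 132008, 278446, 613409, 799201],
    }
)

# Sort by the 业绩 column in descending order, then keep the first three rows.
top3 = data.sort_values("业绩", ascending=False).head(3)

# Alternative: ask directly for the three largest values in that column.
top3_alt = data.nlargest(3, "业绩")

print(top3)
```

Both results keep the original row labels (4, 3, 2), which is why the index in the output is not renumbered.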
Plotting
In [71]: import matplotlib.pyplot as plt
# Make ipython-notebook render matplotlib plots inline
%matplotlib inline
In [74]: df = data
# Plot
df[u"业绩"].plot()
MaxValue = df[u"业绩"].max()
MaxName = df[u"姓名"][df[u"业绩"] == df[u"业绩"].max()].values
Text = str(MaxValue) + " - " + MaxName
# Add a text annotation to the plot
plt.annotate(Text, xy=(1, MaxValue), xytext=(8, 0), xycoords=('axes fraction', 'data'), textcoords='offset points')
If you comment out the plt.annotate line, the plot is drawn without the text label.
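In the session, MaxName is an array because of .values, so Text is an array as well. If a plain string label is preferred, one possible variant looks the name up with idxmax. This is a minimal sketch assuming Python 3 and a recent pandas; the lower-case variable names are illustrative, and plotting is left out so the script runs without matplotlib.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "姓名": ["hello", "the", "cruel", "world", "en"],
        "业绩": [244937, 132008, 278446, 613409, 799201],
    }
)

# Largest value in the 业绩 column.
max_value = df["业绩"].max()

# idxmax returns the row label of that value; use it to read the name.
max_name = df.loc[df["业绩"].idxmax(), "姓名"]

# Scalar values on both sides, so the label is an ordinary string.
text = "{} - {}".format(max_value, max_name)
print(text)  # 799201 - en
```

The resulting text could be passed to plt.annotate in place of Text.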