Limits on Python Lists?
I'm trying to assemble a bunch of information into a usable array like this:
for (dirpath, dirnames, filenames) in walk('e:/machin lerning/econ/full_set'):
    ndata.extend(filenames)
for i in ndata:
    currfile = open('e:/machin lerning/econ/full_set/' + str(i), 'r')
    rawdata.append(currfile.read().splitlines())
    currfile.close()
rawdata = numpy.array(rawdata)
for order, file in enumerate(rawdata[:10]):
    for i in rawdata[order]:
        r = i.split(',')
        pdata.append(r)
    fdata.append(pdata)
    pdata = []
fdata = numpy.array(fdata)
plt.figure(1)
plt.plot(fdata[:,1,3])
Edit: after printing fdata.shape when using the first 10 txt files, i.e.
for order,file in enumerate(rawdata[:10]):
I see (10, 500, 7). If I don't limit the size of this, and instead use
for order,file in enumerate(rawdata):
then fdata.shape is (447,). This seems to happen whenever I increase the number of elements iterated through in the rawdata array above 13... It's not a specific location either - I changed it to
for order,file in enumerate(rawdata[11:24]):
and it worked fine. Aaaaahhh. In case it's useful, here's a sample of what the text files look like:
20080225,a,31.42,31.79,31.2,31.5,30575
20080225,aa,36.64,38.95,36.48,38.85,225008
20080225,aapl,118.59,120.17,116.664,119.74,448847
It looks like fdata is an object dtype array, and the error is in fdata[:,1,3]. That tries to index fdata with 3 indices: a slice, 1, and 3. If fdata is a 1d or 2d array, that produces the error - too many indices.
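For illustration, here is a minimal reproduction of that error; the data is made up, but the shapes mirror the situation described above:

```python
import numpy as np

# ragged nested lists collapse to a 1d object array
a = np.array([[1, 2, 3], [4, 5], []], dtype=object)
print(a.shape)  # (3,)

try:
    a[:, 1, 3]  # three indices into a 1d array
except IndexError as e:
    print(e)    # "too many indices" error
```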
When you get 'too many indices' errors, figure out the shape of the offending array. Don't guess. Add a debug statement: print(fdata.shape).
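A sketch of that kind of check, using stand-in data rather than the real files (the variable names follow the question's code): print the row count of each per-file list to locate the odd one out before calling np.array.

```python
import numpy as np

# hypothetical stand-in for the per-file lists built in the question's loop:
# two normal 500-line files and one empty one
fdata = [[['1'] * 7] * 500, [['1'] * 7] * 500, []]

arr = np.array(fdata, dtype=object)
print(arr.shape)  # (3,) - collapses because one entry is a different length

# report each file index whose line count doesn't match
for order, pdata in enumerate(fdata):
    if len(pdata) != 500:
        print(order, len(pdata))
```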
===================
Taking your file sample as a list of lines:
In [822]: txt = b"""20080225,a,31.42,31.79,31.2,31.5,30575
     ...: 20080225,aa,36.64,38.95,36.48,38.85,225008
     ...: 20080225,aapl,118.59,120.17,116.664,119.74,448847
     ...: """
In [823]: txt = txt.splitlines()
In [826]: fdata = []
In [827]: pdata = []
Read one 'file':
In [828]: for i in txt:
     ...:     r = i.split(b',')
     ...:     pdata.append(r)
     ...: fdata.append(pdata)
     ...:
In [829]: fdata
Out[829]: [[[b'20080225', b'a', b'31.42', b'31.79', b'31.2', b'31.5', b'30575'], ...]]
In [830]: np.array(fdata)
Out[830]:
array([[[b'20080225', b'a', b'31.42', b'31.79', b'31.2', b'31.5', b'30575'],
        ...]], dtype='|S8')
In [831]: _.shape
Out[831]: (1, 3, 7)
Read an 'identical' file:
In [832]: for i in txt:
     ...:     r = i.split(b',')
     ...:     pdata.append(r)
     ...: fdata.append(pdata)
In [833]: len(fdata)
Out[833]: 2
In [834]: np.array(fdata).shape
Out[834]: (2, 6, 7)
In [835]: np.array(fdata).dtype
Out[835]: dtype('S8')
Note the dtype - strings of 8 characters. Since one value per line is a string (the ticker), it can't convert the whole thing to numbers.
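When the files are uniform, you can still get numbers out by slicing off the string columns before converting; a sketch using the sample rows:

```python
import numpy as np

txt = [b"20080225,a,31.42,31.79,31.2,31.5,30575",
       b"20080225,aa,36.64,38.95,36.48,38.85,225008"]
rows = [line.split(b',') for line in txt]

arr = np.array(rows)             # fixed-width byte strings, e.g. dtype |S8
nums = arr[:, 2:].astype(float)  # drop date and ticker, convert the rest
print(nums.shape)  # (2, 5)
```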
Now read a different 'file' (one less line, and one less value on the last line):
In [836]: txt1 = b"""20080225,a,31.42,31.79,31.2,31.5,30575
     ...: 20080225,aa,36.64,38.95,36.48,38.85
     ...: """
In [837]: txt1 = txt1.splitlines()
In [838]: for i in txt1:
     ...:     r = i.split(b',')
     ...:     pdata.append(r)
     ...: fdata.append(pdata)
In [839]: len(fdata)
Out[839]: 3
In [840]: np.array(fdata).shape
Out[840]: (3, 8)
In [841]: np.array(fdata).dtype
Out[841]: dtype('O')
Now let's add an 'empty' file - one that appends no rows, just an empty pdata, []:

In [842]: fdata.append([])
In [843]: np.array(fdata).shape
Out[843]: (4,)
In [844]: np.array(fdata).dtype
Out[844]: dtype('O')
The array shape and dtype have totally changed. It can no longer create a uniform 3d array from the lines.
The shape after 10 files, (10, 500, 7), means 10 files, 500 lines each, 7 columns per line. One or more of the full set of 447 files must be different. The last iteration above suggests at least one is empty.
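One way to make the original loop robust, sketched with hypothetical stand-in data and the question's variable names: check each file's rows before stacking, and skip (or at least report) any file that doesn't match the expected shape.

```python
import numpy as np

# hypothetical parsed files: two uniform 3x7 files and one empty one
files = {'good1.txt': [['1'] * 7] * 3,
         'good2.txt': [['2'] * 7] * 3,
         'empty.txt': []}

fdata = []
for name, pdata in files.items():
    # skip files whose row count or column count doesn't match
    if len(pdata) != 3 or any(len(r) != 7 for r in pdata):
        print('skipping', name, 'with', len(pdata), 'rows')
        continue
    fdata.append(pdata)

fdata = np.array(fdata)
print(fdata.shape)  # (2, 3, 7) - a clean 3d array again
```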