Huge memory footprint of expressions
Using ByteCount (specifically, using asizeof module), one can see a huge consumption of memory to store Mathics expressions.
For example, storing List[1, 2] takes about 38x the amount MMA takes. Things get worse if using a 64 bit CPython - the amount almost doubles. Storing Range[1000000] (of course, unrolled) takes for instance 584699560 bytes - it's almost 600 MB! - while MMA takes 8000144 bytes - 8 MB ca.).
I guess CPython does some caching and optimizations - in fact there's a 5-10% difference between the value from Task Manager and the value from ByteCount.
That's a serious problem at all. May pickle and compression help?
While concerning, this makes sense. We have to allocate a whole lot of Integer objects.
I think a partial solution to this is packed arrays. I did some testing and I think that's how MMA handles this too:
angus@thinkpad> math ~
Mathematica 10.4.1 for Linux x86 (64-bit)
Copyright 1988-2016 Wolfram Research, Inc.
In[1]:= x = Range[1000000];
In[2]:= ByteCount[x]
Out[2]= 8000144
In[3]:= x[[1000000]] = 1
Out[3]= 1
In[4]:= ByteCount[x]
Out[4]= 8000144
In[5]:= x[[1000000]] = y (* Array can no longer be packed *)
Out[5]= y
In[6]:= ByteCount[x]
Out[6]= 24000064
In the long term, reducing the memory usage of the core data structures could be done by e.g. rewriting in C or possibly even just allocating raw python ints:
$> python
Python 3.5.2 (default, Jun 28 2016, 08:46:01)
[GCC 6.1.1 20160602] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = list(range(1000000))
>>> import sys
>>> sys.getsizeof(a)
9000112
Or maybe even trying to use __slots__: http://tech.oyster.com/save-ram-with-python-slots/
Also, on my system things aren't quite as bad (using #560):
In[1]:= ByteCount[Range[1000000]]
Out[1]= 297396384
@rocky this still isn't closed so I think it hasn't been solved.
Maybe a NumPy array would help here?
I don't think this has been addressed although the side issues have.
Working out all of the details in implementing Mathics Lists as Numpy arrays seems to me tricky. You'd need to know when it gains and would need to materialize the array on demand.
You'd need to know when it gains and would need to materialize the array on demand.
@rocky what do you mean by that?
Working out all of the details in implementing Mathics Lists as Numpy arrays seems to me tricky.
Yes, but the memory footprint would be much smaller. I would try to implement that, but I don't know where do it, in the class List, RowBox?
You'd need to know when it gains and would need to materialize the array on demand.
@rocky what do you mean by that?
Supposed I want to implement NumpyArrays for Python lists. But there is overhead to convert a Python list into a NumpyArray. And I guess I need to convert back from Numpy to Python. This is overhead. For many small Python lists this won't be of any benefit because just the time spent to convert in and out.
Working out all of the details in implementing Mathics Lists as Numpy arrays seems to me tricky.
Yes, but the memory footprint would be much smaller. I would try to implement that, but I don't know where do it, in the class
List,RowBox?
Lists are fundamental in Mathics and WL. All expressions are lists. I think this would be very difficult and honestly I am not sure I would know how to do.
Supposed I want to implement NumpyArrays for Python lists. But there is overhead to convert a Python list into a NumpyArray. And I guess I need to convert back from Numpy to Python. This is overhead. For many small Python lists this won't be of any benefit because just the time spent to convert in and out.
I think on this and implement it in mathics scanner would cut the overhead, as string to list has overhead, the same happen with string to NumPy but much smaller.
Lists are fundamental in Mathics and WL. All expressions are lists.
Thanks by the information, this will help.
I think this would be very difficult and honestly I am not sure I would know how to do.
I'll try to do it, I'll have time next week, the tests are going to help a lot here (help in my success or in my fail).