He Kaisheng

Results: 6 comments by He Kaisheng

The output type is wrong in the `tile` of `DataFrameDropDuplicates`. Here is the related code: https://github.com/mars-project/mars/blob/86bbdd0e63e04fa278c240b5398bb895310c84c1/mars/dataframe/base/_duplicate.py#L151-L155. For series input, the output type should always be series.
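For reference, a minimal sketch of the expected behaviour in plain pandas (this is not the Mars code linked above, and the sample data is made up for illustration): `drop_duplicates` on a series returns a series, and on a DataFrame it returns a DataFrame.

```python
import pandas as pd

# Illustrative data only; plain pandas serves as the reference behaviour here.
s = pd.Series([1, 2, 2, 3, 3, 3])
df = pd.DataFrame({"a": [1, 1, 2], "b": [4, 4, 5]})

s_out = s.drop_duplicates()    # series in, series out
df_out = df.drop_duplicates()  # DataFrame in, DataFrame out

print(type(s_out).__name__)   # Series
print(type(df_out).__name__)  # DataFrame
```

A Mars implementation would presumably need to preserve the same input-to-output type mapping for each chunk it tiles.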

To simplify:

``` Python
In [6]: rs = np.random.RandomState(0)
   ...: raw_df = rs.rand(20, 10)
   ...: raw_df = pd.DataFrame(
   ...:     np.where(raw_df > 0.4, raw_df, np.nan), columns=list("ABCDEFGHIJ")
   ...: )
   ...: raw_df2 =...
```

Looks reasonable. `batch_get` may also be useful for other storage backends.

After digging into it, I found that this is a pandas issue:

``` Python
In [20]: a = pd.DataFrame({'a':['a','b', 'c'] * 5, 'b': ['d', 'e', 'f'] * 5, 'c': range(15)}).astype({'a': "category", "b":...
```

> @hekaisheng You're right, I'll make a PR soon. Thanks!

Please copy-paste your code and error message instead of screenshots.