Skip to content

Allow no returns from expanding_apply #5881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jseabold opened this issue Jan 8, 2014 · 11 comments
Closed

Allow no returns from expanding_apply #5881

jseabold opened this issue Jan 8, 2014 · 11 comments
Labels
API Design Numeric Operations Arithmetic, Comparison, and Logical operations Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Comments

@jseabold
Copy link
Contributor

jseabold commented Jan 8, 2014

related #4130

I haven't looked at the code at how easy this is to do, but it looks like things like expanding_apply (moving_apply too I suspect) expects a float returned from the function. It might make sense to have an optional no_return flag (or just be smart about it, though perhaps it makes sense to have a flag as a sanity check for the user.)

Use case, I'm using expanding_apply to run some code in a subprocess and write to disk. I'm able to get around the expectation by having my function return 0.0 or something, but it's a bit of a hack.

@jreback
Copy link
Contributor

jreback commented Jan 8, 2014

see the linked issue; these rolling/expanding methods need an expanded API to deal with better input types and output types.

@jseabold
Copy link
Contributor Author

jseabold commented Jan 8, 2014

Thanks, I promise I did search this time first. Feel free to close this as a dupe.

@jreback
Copy link
Contributor

jreback commented Jan 8, 2014

no...going to leave it open as it actually is different that the other issues (which are more on the input side)...

now...just need someone to do this !

@jseabold
Copy link
Contributor Author

jseabold commented Mar 7, 2014

Just came across this. plyr has a function a_ply that throws away the results for calls that are only wanted for their side effects. There are actually a whole suite of apply-type functions that only differ in the object that they return. Don't know how useful that is in Python land, but just thought I'd park it here.

http://cran.r-project.org/web/packages/plyr/index.html

@jseabold
Copy link
Contributor Author

jseabold commented Apr 7, 2014

Parking this here because it's a similar issue. I just ran into this now that read_stata correctly returns float32 (default in stata). expanding_apply and friends always return float (64-bit on my machine).

I was doing an expanding_apply on a series via groupby and it borked because it can't safely cast 64-bit to 32-bit. The workaround is to cast the series beforehand to float64, but it might be nice if you can specify the return type of your function beforehand.

I think the solution would just be a matter of adding a few different version of algos.roll_generic and having a return_type argument to the (generic) rolling functions. E.g., return_type in [None, np.float32, np.float64, ...]. AFAIK there's no good way at runtime to determine the return type of the function.

@jreback
Copy link
Contributor

jreback commented Apr 7, 2014

see here: #3007

really just a matter of updating the code generator

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Apr 7, 2014
@jseabold
Copy link
Contributor Author

jseabold commented Apr 7, 2014

I'm not sure fused types are the solution here since you don't give the output array to the function. I suppose you could create the output array outside of Cython but it might be slow.

Actually it might not be any slower than the current code because the creation of the output array is python code. I don't think cimporting numpy does any magic here. roll_generic could be optimized by using the numpy C-api for creating, iterating over and assigning to the output array. This is done elsewhere in the code base IIRC.

@jseabold
Copy link
Contributor Author

jseabold commented Apr 7, 2014

I guess you could try to infer the return type from the input array but that assumes func returns the same type.

@jreback
Copy link
Contributor

jreback commented Apr 7, 2014

the fused types reference is a general comment; instead of using a code generator idea was to use afused type (which cython based just auto types for you). its really just a code simplicty.

actually very little of the code uses the numpy C-api. and in fact trying to NOT do that. cython api translates it anyhow.

@jreback
Copy link
Contributor

jreback commented Apr 7, 2014

my comment about 3007 was inregards to your comment (not this issue really), except that it needs to handle other dtypes

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@datapythonista datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018
@mroeschke
Copy link
Member

Dtypes for window ops is covered in #11446 so closing in favor of that issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design Numeric Operations Arithmetic, Comparison, and Logical operations Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

No branches or pull requests

4 participants