Skip to content

Rcpp::Language appears to deep-clone arguments #1386

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
qddyy opened this issue Jun 2, 2025 · 10 comments · Fixed by #1388
Closed

Rcpp::Language appears to deep-clone arguments #1386

qddyy opened this issue Jun 2, 2025 · 10 comments · Fixed by #1388

Comments

@qddyy
Copy link

qddyy commented Jun 2, 2025

I've observed a difference in behavior between Rcpp::Language and the lower-level Rf_lang* functions when constructing function calls.

To my understanding, function calls in R are essentially pairlists (LANGSXP) that hold SEXP — that is, pointers to R objects. As such, if an argument is modified after the call is constructed, the change should be visible during evaluation, since the call stores only pointers, not copies.

However, calls created using Rcpp::Language do not reflect such modifications. It appears that the arguments are deep-copied during the construction of the call. This behavior did not occur in Rcpp 1.0.13, but it does appear in Rcpp 1.0.14.

Here is a minimal reproducible example:

Rcpp::cppFunction("
void test_call(Function f) {
    auto v = NumericVector::create(0.0, 1.0);

    Rcpp::Language call_Rcpp(f, v);

    Rcpp::Shield<SEXP> call_lang(Rf_lang2(f, v));

    v[0] = 999.0;

    Rcpp::Rcout << as<double>(Rcpp_fast_eval(call_Rcpp, R_GlobalEnv)) << std::endl;
    Rcpp::Rcout << as<double>(Rcpp_fast_eval(call_lang, R_GlobalEnv)) << std::endl;
}
")

test_call(sum)
#> 1
#> 1000

I'm wondering whether this change in behavior was intentional, or if it might be an unintended consequence of recent updates?

@eddelbuettel
Copy link
Member

Thanks for opening a bug report, and I can confirm. With a slightly modified version of your script also showing the package version I have e.g. (in a container after installing 1.0.13-1)

root@a5a6d7fb905b:/work# Rscript -e 'Rcpp::sourceCpp("issue1386.cpp")'

> test_call(sum)
1000
1000

> packageVersion("Rcpp")
[1] ‘1.0.13.1’
root@a5a6d7fb905b:/work# 

yet 1.0.14 releases last January shows the effect you isolated. I may have a go at bisecting the repo.

@Enchufa2
Copy link
Member

Enchufa2 commented Jun 2, 2025

The most probable change is this one: dfb530d

@eddelbuettel
Copy link
Member

Confirmed!! Good call, @Enchufa2 !

Bad at dfb530d

root@7b5e5fd39ae1:/work# git checkout dfb530d7 
Note: switching to 'dfb530d7'.

You are in 'detached HEAD' state. You can look around, make experimental

changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at dfb530d7 avoid using SET_TYPEOF (#1315)
root@7b5e5fd39ae1:/work# install.r
* installing *source* package found in current working directory ...
* installing *source* package ‘Rcpp’ ...
** this is package ‘Rcpp’ version ‘1.0.13.0.2’
** using staged installation
** libs
using C++ compiler: ‘g++ (Debian 14.2.0-19) 14.2.0’
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c api.cpp -o api.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c attributes.cpp -o attributes.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c barrier.cpp -o barrier.o
g++ -std=gnu++17 -shared -L/usr/lib/R/lib -Wl,-z,relro -o Rcpp.so api.o attributes.o barrier.o date.o module.o rcpp_init.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/00LOCK-work/00new/Rcpp/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (Rcpp)
root@7b5e5fd39ae1:/work# Rscript -e 'Rcpp::sourceCpp("issue1386.cpp")'

> test_call(sum)
1
1000

> packageVersion("Rcpp")
[1] ‘1.0.13.0.2’
root@7b5e5fd39ae1:/work# 

Good at c1699a0

root@7b5e5fd39ae1:/work# git checkout c1699a01                                                           
Note: switching to 'c1699a01'.                                                                           
                                                                                                         
You are in 'detached HEAD' state. You can look around, make experimental                                 
changes and commit them, and you can discard any commits you make in this                                                                                                                                          
state without impacting any branches by switching back to a branch.
                                                    
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:                                       
                                                                                                         
  git switch -c <new-branch-name>                                                                        
                                                                                                         
Or undo this operation with:                                                                             
                                                                                                                                                                                                                   
  git switch -             
                                                    
Turn off this advice by setting config variable advice.detachedHead to false
                                                    
HEAD is now at c1699a01 Roll micro-release, update NEWS [ci ski]
root@7b5e5fd39ae1:/work# ./cleanup                                                                       
root@7b5e5fd39ae1:/work# install.r                                                                       
* installing *source* package found in current working directory ...
* installing *source* package ‘Rcpp’ ...
** this is package ‘Rcpp’ version ‘1.0.13.0.1’                                                           
** using staged installation                                                                             
** libs
using C++ compiler: ‘g++ (Debian 14.2.0-19) 14.2.0’
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c api.cpp -o api.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c attributes.cpp -o attributes.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c barrier.cpp -o barrier.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c date.cpp -o date.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c module.cpp -o module.o
g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/      -fpic  -g -O2 -ffile-prefix-map=/build/reproducible-path/r-base-4.5.0=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werr
or=format-security -fcf-protection -Wdate-time -D_FORTIFY_SOURCE=2   -c rcpp_init.cpp -o rcpp_init.o
g++ -std=gnu++17 -shared -L/usr/lib/R/lib -Wl,-z,relro -o Rcpp.so api.o attributes.o barrier.o date.o module.o rcpp_init.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/00LOCK-work/00new/Rcpp/libs
** R                                               
** inst                                                                                                                                                                                                            ** byte-compile and prepare package for lazy loading                                    
** help                                                                                                                                                                                                            *** installing help indices                                                                              
** building package indices                                                                                                                                                                                        ** installing vignettes                                                                                  
** testing if installed package can be loaded from temporary location                                                                                                                                              
** checking absolute paths in shared objects and dynamic libraries     
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (Rcpp)                                                                                            
root@7b5e5fd39ae1:/work# Rscript -e 'Rcpp::sourceCpp("issue1386.cpp")'                                    
                                                    
> test_call(sum)           
1000                   
1000                                                                                                     
                                                                                                         
> packageVersion("Rcpp")                                                                                 
[1] ‘1.0.13.0.1’                                                                                         
root@7b5e5fd39ae1:/work# 

And those two were adjacent:

d9b54315 * Roll micro version, update date, update NEWS
dfb530d7 * avoid using SET_TYPEOF (#1315)
c1699a01 * Roll micro-release, update NEWS [ci ski]
d303f9e1 * drop support for user-defined databases (#1314)
19328ad6 * 1.0.13 Release 1.0.13

@eddelbuettel
Copy link
Member

And it looks like it may be the first of the changes in there ie

Image

Going back to the outlawed SET_TYPEOF restores the behavior in the test file.

@qddyy
Copy link
Author

qddyy commented Jun 2, 2025

Thanks for the quick response!

Based on the discussion so far, I think I may have found the root cause of the issue. When a Language object is constructed, it passes a pairlist (LISTSXP) to Storage::set__

Storage::set__(pairlist(function, t...));

which then forwards it to the update method
static_cast<CLASS&>(*this).update(data) ;

Within update, the pairlist is passed to r_cast<LANGSXP>

Storage::set__(r_cast<LANGSXP>(x));

which eventually calls r_true_cast

return internal::r_true_cast<TARGET>(x); // #nocov

However, r_true_cast<LANGSXP> internally uses R's as.call

return convert_using_rfunction(x, "as.call" );

which, based on a small test I just ran, seems to perform a deep copy when converting a pairlist into a function call.

Here's the code I used:

Rcpp::cppFunction('
void test_R_as_call(Function f) {
    auto v = NumericVector::create(0.0, 1.0);

    Rcpp::Shield<SEXP> p_lst(pairlist(f, v));

    Rcpp::Function R_as_call("as.call");

    Rcpp::Shield<SEXP> call(R_as_call(p_lst));

    v[0] = 999.0;

    Rcpp::Rcout << as<double>(Rcpp_fast_eval(call, R_GlobalEnv)) << std::endl;
}
')

test_R_as_call(sum)
#> 1

@eddelbuettel
Copy link
Member

Nice work. While you are in the weeds, can you see a way to treat pairlist objects differently?

@qddyy
Copy link
Author

qddyy commented Jun 2, 2025

It seems the deep clone occurs here

The duplicate call appears to enforce R’s copy-on-write semantics. Making r_true_cast<LANGSXP>(pairlist) bypass it would require manually calling SET_TYPEOF, as done in do_ascall. Unfortunately, this is not allowed...

Given the complexity of converting a LISTSXP directly into a LANGSXP, I think using Rf_lang*/Rcpp_lang* directly in Rcpp::Language instead of starting from a pairlist might be better. However, this still does not solve the issues with r_cast<LANGSXP>. It seems that Rcpp follows C++ semantics rather than R’s copy-on-write model. Am I understanding that correctly?

@kevinushey
Copy link
Contributor

kevinushey commented Jun 2, 2025

Given the complexity of converting a LISTSXP directly into a LANGSXP, I think using Rf_lang*/Rcpp_lang* directly in Rcpp::Language instead of starting from a pairlist might be better.

I agree with this as well; this would be the most appropriate fix.

However, this still does not solve the issues with r_cast. It seems that Rcpp follows C++ semantics rather than R’s copy-on-write model. Am I understanding that correctly?

Can you elaborate on what you mean here? As I understand it, r_cast<> will check if the object has the expected type, and if so, leave it alone; otherwise, produce a new object as an appropriately type-coerced version of the input.

I guess it's not a "cast" in the way we think of it in C++, where we might reinterpret an existing object as another object -- it's really just a "coerce" or "convert" kind of method?

@qddyy
Copy link
Author

qddyy commented Jun 3, 2025

Can you elaborate on what you mean here? As I understand it, r_cast<> will check if the object has the expected type, and if so, leave it alone; otherwise, produce a new object as an appropriately type-coerced version of the input.

The issue we observed with Language appears to stem mainly from an implicit copy triggered by r_cast<LANGSXP>(LISTSXP), which we hadn't fully realized before. This copying is primarily due to R's copy-on-write semantics, and as such, it doesn't seem to align with Rcpp's typical behavior. My original concern is that this unnoticed copy might lead to similar issues elsewhere.

I guess it's not a "cast" in the way we think of it in C++, where we might reinterpret an existing object as another object -- it's really just a "coerce" or "convert" kind of method?

Thanks to your explanation. It seems my earlier concern may not actually apply. Cases where r_cast<LANGSXP>(LISTSXP) truly requires non-copying behavior are probably quite rare.

@eddelbuettel
Copy link
Member

Really nice work by both of you on the initial report and the follow-up, and the cleanup PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants