Description
I imagine the following request is popular enough that it has probably been asked before. I just couldn't find it. Feel free to close if the discussion has already occurred.
Consider the following R example:
# R code
> df <- data.frame(A=c(0:4), B=c("str0","str1","str2","str3","str4"), C=(1:5))
> df
A B C
1 0 str0 1
2 1 str1 2
3 2 str2 3
4 3 str3 4
5 4 str4 5
The equivalent Pandas syntax would be:
# python code
> df = pd.DataFrame({ "A":range(0,5), "B":["str"+str(x) for x in range(0,5)], "C":range(1,6) })
> df
A B C
0 0 str0 1
1 1 str1 2
2 2 str2 3
3 3 str3 4
4 4 str4 5
When slicing an R data frame, the syntax allows direct access with the bracket operator:
# R code
> df[1,]
A B C
1 0 str0 1
The pandas equivalent would be:
# python code
> df.iloc[0,]
A 0
B str0
C 1
Name: 0, dtype: object
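One detail worth noting about the comparison above: R's df[1,] returns a one-row data frame, while df.iloc[0,] returns a Series. As a small sketch using the same example frame, passing a list of positions to iloc keeps the result as a DataFrame, which is closer to the R output:

```python
import pandas as pd

# Same example frame as above.
df = pd.DataFrame({
    "A": range(0, 5),
    "B": ["str" + str(x) for x in range(0, 5)],
    "C": range(1, 6),
})

row_as_series = df.iloc[0]    # scalar position -> Series
row_as_frame = df.iloc[[0]]   # list of positions -> one-row DataFrame

print(type(row_as_series).__name__)
print(type(row_as_frame).__name__)
print(row_as_frame)
```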
Disregarding the 0-based vs 1-based indexing difference between the languages, I would like to propose adding a new slicing operation to the pandas DataFrame __getitem__() that overloads the brackets operator and matches the behavior of R-style syntax. It would allow for cleaner and less verbose code (especially when chaining multiple slice operations).
API breaking implications
The API change would be a positive addition. No removal of current slicing operations (iloc, loc, etc.)
Additional context
The following slicing examples all raise an error in pandas, whereas the same expressions are valid slicing in R:
# python code
> df[0]
> df[0, ]
> df[0,0]
> df[0, 1:2]
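For anyone who wants to reproduce this, here is a rough sketch that tries each of the four expressions on the example frame and records the exception type. The exact exception class for the last case may vary between pandas and Python versions, so the script only reports whatever is raised:

```python
import pandas as pd

df = pd.DataFrame({
    "A": range(0, 5),
    "B": ["str" + str(x) for x in range(0, 5)],
    "C": range(1, 6),
})

# Each key is what Python passes to __getitem__ for the expression on the left.
keys = {
    "df[0]": 0,
    "df[0, ]": (0,),
    "df[0, 0]": (0, 0),
    "df[0, 1:2]": (0, slice(1, 2)),
}

errors = {}
for label, key in keys.items():
    try:
        df[key]
    except Exception as exc:  # broad on purpose: the class differs per case
        errors[label] = type(exc).__name__
    else:
        errors[label] = None  # would mean the lookup unexpectedly succeeded

for label, name in errors.items():
    print(f"{label:12} -> {name}")
```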
Note that the same indexers all succeed when passed to iloc. What I'm suggesting is that the logic for iloc be copied/moved into the pandas DataFrame __getitem__() function.
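To make the idea concrete, here is a minimal sketch of the requested behavior written as a thin wrapper rather than a change to pandas itself. PositionalView is a hypothetical name, not an existing pandas class; it simply forwards the bracket key to iloc and re-wraps DataFrame results so that slices can be chained:

```python
import pandas as pd


class PositionalView:
    """Hypothetical wrapper (not part of pandas): view[...] forwards to DataFrame.iloc."""

    def __init__(self, frame: pd.DataFrame) -> None:
        self.frame = frame

    def __getitem__(self, key):
        result = self.frame.iloc[key]
        # Re-wrap DataFrame results so bracket slices can be chained.
        if isinstance(result, pd.DataFrame):
            return PositionalView(result)
        return result


df = pd.DataFrame({
    "A": range(0, 5),
    "B": ["str" + str(x) for x in range(0, 5)],
    "C": range(1, 6),
})
view = PositionalView(df)

first_row = view[0]        # same as df.iloc[0]
first_row_t = view[0, ]    # same as df.iloc[0, ]
cell = view[0, 0]          # same as df.iloc[0, 0]
part = view[0, 1:2]        # same as df.iloc[0, 1:2]

# Chained slices: rows 1..3 first, then the first two rows and columns of that.
chained = view[1:4][0:2, 0:2].frame
print(chained)
```

A wrapper is used here only to keep the sketch from interfering with the existing df[...] column lookup; the actual proposal is for __getitem__() itself to accept these keys.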