Grouping by color and/or symbol changes the order of error_y bars

Open angiachino opened this issue 6 years ago • 3 comments

Reopening issues #762 and #1110: this is still happening in 2019, and I may have found the cause. Briefly, the error bars drawn by error_x and error_y appear in the wrong order when the data are grouped by color or (as I found) by symbol. Reusing @Cristoforetti's code from #1110:

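library(plotly)

# Y and SD are identical, so each error bar should be as long as its point's y value
df<-data.frame("X"=c(1:20),
               "Y"=c(1:20),
               "SD"=c(1:20),
               "G"=c(rep("A",3),rep("B",5),rep("C",4),rep("D",5),rep("A",3)),
               "g"=c(rep("a",2),rep("b",5),rep("c",4),rep("d",5),rep("e",4))
)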
# no grouping; error bars correct
p1<-plot_ly(df,
            x=~X,
            y=~Y,
            type="scatter",
            mode="markers",
            error_y =list(
              array=~SD,
              thickness=1
            )
)
# grouping by color; error bars wrong
p2<-plot_ly(df,
            x=~X,
            y=~Y,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p1,p2)
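
The mismatch can also be checked without looking at the plot. The following is only a sketch: `bars_aligned` is a hypothetical helper, not part of plotly, and it relies on the fact that SD equals Y in every row of the example data, so a correctly built trace should carry the same values in `y` and `error_y$array`. It assumes that `plotly_build()` exposes the traces under `$x$data`.

```r
# Hypothetical helper: TRUE when a built trace's error array matches its y values.
# Only meaningful for the example data, where SD equals Y in every row.
bars_aligned <- function(trace) {
  isTRUE(all.equal(as.numeric(trace$y), as.numeric(trace$error_y$array)))
}

# The plotly part is skipped when the package is not installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  df <- data.frame(
    X = 1:20, Y = 1:20, SD = 1:20,
    G = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3))
  )
  p2 <- plotly::plot_ly(df, x = ~X, y = ~Y, color = ~G,
                        type = "scatter", mode = "markers",
                        error_y = list(array = ~SD, thickness = 1))
  built <- plotly::plotly_build(p2)
  # One logical value per trace (one trace per level of G)
  print(vapply(built$x$data, bars_aligned, logical(1)))
}
```

Any FALSE in the printed vector would point to a trace whose error bars were taken from different rows than its points.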

I found that the correct behaviour (error bars attached to the correct data points) can be restored by passing a version of the input data frame ordered by the color column:

# ordering the input data frame by the color column yields the correct behaviour
p3<-plot_ly(df[order(df$G),],
            x=~X,
            y=~Y,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p1,p2,p3)

The same happens when grouping by symbol: the error bars are misplaced unless df is ordered by the symbol column. When using both color and symbol, one must order by the color column first and then by the symbol column:

# using "G" for color and "g" for symbols; ordering by G, then g yields the correct behaviour
p4<-plot_ly(df[order(df$G,df$g),],
            x=~X,
            y=~Y,
            symbol=~g,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
# ordering by g, then G yields the wrong behaviour
p5<-plot_ly(df[order(df$g,df$G),],
            x=~X,
            y=~Y,
            symbol=~g,
            color=~G,
            type="scatter",
            mode="markers",
            error_y =list(array=~SD,
                          thickness=1
            )
)
subplot(p4,p5)


What seems to be happening here is that color and symbol reorder the copy of df handled by plot_ly, but for some reason the reordering only affects the columns mapped to x, y, color and symbol. As a consequence, the column used by error_y is now in the wrong order relative to the others, and the error bars are attached to the wrong data points. Ordering df before calling plot_ly, or passing an ordered copy of it, works around the issue. Still, it would be great to see this fixed in a future version of plotly.
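
The workaround can be wrapped in a small function so that the ordering is not forgotten. This is a minimal sketch, not a fix for plotly itself: `order_by_groups` is a hypothetical helper name, and it simply sorts the rows by the grouping columns in the order given (color column first, then symbol column) before the data frame reaches plot_ly.

```r
# Hypothetical helper: sort rows by the grouping columns, in the order given.
order_by_groups <- function(data, cols) {
  keys <- unname(as.list(data[cols]))        # unnamed list of sort keys for order()
  data[do.call(order, keys), , drop = FALSE]
}

df <- data.frame(
  X = 1:20, Y = 1:20, SD = 1:20,
  G = c(rep("A", 3), rep("B", 5), rep("C", 4), rep("D", 5), rep("A", 3)),
  g = c(rep("a", 2), rep("b", 5), rep("c", 4), rep("d", 5), rep("e", 4))
)

# Color column ("G") first, then symbol column ("g"), as in p4 above
df_sorted <- order_by_groups(df, c("G", "g"))

# The plotly part is skipped when the package is not installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(df_sorted, x = ~X, y = ~Y, color = ~G, symbol = ~g,
                       type = "scatter", mode = "markers",
                       error_y = list(array = ~SD, thickness = 1))
}
```

Passing only `"G"` as `cols` reproduces the ordering used for p3.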

angiachino, Nov 13 '19 15:11

It seems that this issue is still not fixed in 2023. Thanks a lot @angiachino for your solution.

Dong

YonghuiDong, Jan 13 '23 16:01

This is still an issue in May 2024. Thank you so much @angiachino for your solution!

rogercherry, May 21 '24 14:05

This is still an issue in February 2025. @angiachino's solution was incredibly helpful!

erikamking, Feb 24 '25 17:02