[2.0] Proposal: A clearer API for vector components (no breaking changes)

Open GregStanton opened this issue 3 months ago • 12 comments

[2.0] Proposal: A clearer API for vector components (no breaking changes, minimal effort)

The current 2.0 API for getting and setting vector components contains ambiguity and complexity that may cause user confusion. It also sets a difficult precedent for future classes like Series and Sheet (replacements for the deprecated TableRow and Table), as well as Matrix and potentially Tensor. This proposal outlines a complete solution that is clearer and more consistent. It can also be implemented without breaking changes and without major effort.

| Current 2.0 API | Proposed API | Breaking? |
| --- | --- | --- |
| values | getElements() / setElements() | No, values hasn’t appeared in the reference yet. |
| getValue() / setValue() | getElement() / setElement() | No, we can deprecate getValue() / setValue(). |
| set() | (deprecated, replaced by setElements()) | No, this is a deprecation. |

The problem with the current API

The unreleased values property is the most significant issue, but set() also creates redundancy.

  • Ambiguous naming: The name values is vague. It could refer to all properties of the vector, not just its components. Standard terms like "element" are more precise.
  • Ambiguous behavior: Because values is intended to replace the getter array(), users may reasonably assume it's read-only, but it's actually writable. This is a recipe for confusion. (I actually made this mistake, as a Math steward.)
  • Confusing redundancy: The API has two ways to set all components: the values property and the set() method. While set() is more powerful (accepting multiple overloads), its overlap with values creates confusion.

The proposed solution and implementation

This proposal solves all identified problems by replacing the current API with a clear, consistent, and extensible alternative.

Recommended API: getElement()/getElements(), setElement()/setElements()

This singular-plural pattern aligns with carefully designed p5.js features like splineProperty()/splineProperties(), providing a predictable experience for users.

The implementation can be broken into three simple, non-breaking tasks:

  • Replace values: The unreleased values property can be directly replaced with getElements() and setElements() methods. The internal logic (get/set keywords on a private _values array) can be reused, making this a minimal change.
  • Deprecate getValue()/setValue(): These can be deprecated in favor of their new names, getElement()/setElement(). Since they are new and unstable, removing them in 3.0 will cause near-zero disruption.
  • Deprecate set(): This method’s functionality is completely and more clearly covered by the new setElements() method.
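
As a rough illustration of the second task, a deprecation alias could look like the sketch below. It is plain JavaScript, and `DemoVector`, its `_values` field, and the warning text are all hypothetical stand-ins rather than the p5.Vector source. The only idea taken from the proposal is that the old names keep working while pointing users to the new ones.

```javascript
// Hypothetical stand-in for a vector class; not the actual p5.Vector source.
class DemoVector {
  constructor(...components) {
    this._values = components;
  }

  // New, explicitly named accessors.
  getElement(index) {
    return this._values[index];
  }

  setElement(index, value) {
    this._values[index] = value;
    return this; // allow chaining
  }

  // Deprecated aliases: same behavior as the new names, plus a warning.
  getValue(index) {
    console.warn('getValue() is deprecated; use getElement() instead.');
    return this.getElement(index);
  }

  setValue(index, value) {
    console.warn('setValue() is deprecated; use setElement() instead.');
    return this.setElement(index, value);
  }
}

const v = new DemoVector(1, 2, 3);
v.setValue(0, 10);            // still works, but warns
console.log(v.getElement(0)); // 10
```

Because the aliases only delegate, existing sketches would keep running unchanged until the old names are removed.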

A couple design details

  1. Although "entry" was originally proposed, "element" has only two extra characters and has a couple advantages. It's easier for non-English speakers to pluralize, and it doesn't conflict with the meaning of "entries" in the native array's entries() method.
  2. There is perhaps a very slight inconsistency with x, y, and z, since these are implemented as joint getters/setters, unlike the separate explicit getters and setters of the proposed API for general components. However, from a user perspective, these are simply numerical fields, which distinguishes them from the methods described in this proposal. Also, these fields are already special cases, as no other components are named (except possibly w).
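
The second point can be pictured with a small sketch in which x, y, and z behave like numeric fields while every other component goes through explicit methods. The class name `NamedFieldVector` and its storage layout are assumptions made for illustration only.

```javascript
// Hypothetical sketch: named fields as accessors over indexed storage.
class NamedFieldVector {
  constructor(x, y, z) {
    this._values = [x, y, z];
  }

  // x, y, and z read and write like plain numeric fields...
  get x() { return this._values[0]; }
  set x(value) { this._values[0] = value; }
  get y() { return this._values[1]; }
  set y(value) { this._values[1] = value; }
  get z() { return this._values[2]; }
  set z(value) { this._values[2] = value; }

  // ...while general components use explicit getter and setter methods.
  getElement(index) {
    return this._values[index];
  }

  setElement(index, value) {
    this._values[index] = value;
    return this;
  }
}

const p = new NamedFieldVector(3, 4, 0);
p.x = 5;            // field-style write
p.setElement(1, 6); // method-style write to the same storage
console.log(p.getElement(0)); // 5
console.log(p.y);             // 6
```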

Why getters/setters in p5 are better separate as class methods but joint as standalone functions

Purpose: This section proposes a long-term API pattern for getters and setters in p5. In the context of the current issue, it’s meant to validate the separate, explicit getter/setter pattern that’s currently used in p5.Vector for individual vector components. It also provides one reason to change from values, which has not yet appeared in the reference, to a separate getter and setter.

Since consistency is important, it's helpful to briefly consider the wider context of p5. The most basic features in p5 are standalone functions like stroke() and fill(). These functions do not have "get" or "set" prefixes. That’s likely beneficial, since a term like "set" feels more like a computer instruction, and it's important to use familiar vocabulary at this critical, early stage of learning to code. In any case, these features are in extremely wide use, and are among the first features that most users will learn, so they establish a very strong precedent.

At the same time, p5 often uses explicit “get” and “set” prefixes elsewhere in its API, especially in class methods. Currently, there isn’t an obvious, universal pattern for when these prefixes are used. These sorts of inconsistencies are a form of technical debt that tends to compound, so it's worthwhile to get the problem under control whenever we do a major release. In the long run, there seem to be a couple approaches for dealing with such inconsistencies that might be viable. Here, "long run" includes future releases such as 3.0.

Approach 1: Joint getters/setters everywhere Here the idea is to move toward converting all getters and setters to joint getters/setters, with the generic API template propertyName(). There may be exceptions, like when a property is meant to be read only, but this is the overall idea.

Approach 2: Joint getters/setters in standalone functions only Here the idea is to make all standalone functions like fill() into joint getters/setters, and to make all class-based getters and setters explicit, with "get" and "set" prefixes. Again, there may be special cases.

Weighing the approaches Each of these options has merits. Converting everything to joint getters/setters provides a completely uniform interface and an economical API. On the other hand, joint getters/setters like fill() are ambiguously named, compared to explicitly named, dedicated getters and setters. Determining a path forward requires prioritizing these trade-offs and accounting for the disruption of breaking changes.
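
To make the trade-off concrete, the sketch below contrasts the two styles in plain JavaScript. `createJointAccessor`, `fillColor`, and `HeadingHolder` are hypothetical names invented for this example; none of them is a p5.js API.

```javascript
// Joint style: one function that gets or sets depending on its arguments.
// `createJointAccessor` is a hypothetical helper, not a p5.js function.
function createJointAccessor(initialValue) {
  let current = initialValue;
  return function (...args) {
    if (args.length === 0) {
      return current; // no arguments: act as a getter
    }
    current = args[0]; // one argument: act as a setter
    return undefined;
  };
}

const fillColor = createJointAccessor('white');
fillColor('red');         // sets
console.log(fillColor()); // 'red' (gets)

// Separate style: explicit, individually named methods on a class.
class HeadingHolder {
  constructor(angle) {
    this._angle = angle;
  }

  getHeading() {
    return this._angle;
  }

  setHeading(angle) {
    this._angle = angle;
    return this;
  }
}

const holder = new HeadingHolder(0);
holder.setHeading(Math.PI);
console.log(holder.getHeading()); // 3.141592653589793
```

The joint form is shorter, but a reader has to count arguments to know whether a call reads or writes. The separate form states that in the name.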

There's a reasonable case to be made that we should move toward the direction of Approach 2, so that joint getters/setters are used for standalone functions only. If all relevant standalone features are joint getters/setters and all relevant class-based methods are separate, explicit getters and setters, then that should provide enough consistency for users to quickly learn the API.

That narrows down the decision to economy vs. clarity. While there are a lot of getters and setters, the total number of them will be multiplied by at most a factor of two, and in reality the growth factor is likely much smaller (separate getters and setters are often used already). Since economy is not likely to be a dealbreaker, clarity should likely be prioritized for a library like p5. And by the time users graduate to object-oriented programming, they're ready for longer, clearer names like getHeading().

Most importantly, a quick CTRL+F of the main reference page suggests that Approach 2 would require significantly fewer breaking changes.

The conclusion for p5.Vector is that separate, explicitly named getters and setters are likely to be preferable.

GregStanton • Oct 15 '25 12:10

I'm not sure I'm on board yet for making class methods use a separate pattern for getters/setters, or at least that that's the right spot to make the switch. Mostly, it's not clear yet to me why there should be a separate pattern for classes vs following the currently strong precedent used elsewhere, and currently my gut reaction is that the reason to switch, in general, isn't quite strong enough to warrant the complexity of a deprecation and a pattern change. (Do all the global methods which also exist on p5.Graphics need to change to be consistent with that rule?)

I think the general pattern in p5 as a whole is to default to joint getters/setters for properties when you're modifying the data directly, and to use separate getter/setters when you're working with a view of the data. So in vectors, the reason why we have get/setHeading is that a vector (in p5, currently) is not characterized by its heading and magnitude, it's characterized by its Cartesian coordinates. What we choose to be a data property vs a view is still up to us though.

I think having that separation of data/view is not a bad thing, as there are a lot of little issues you can run into when trying to treat what's really a view like data that create a lot of difficult maintenance work to avoid (e.g. values changing when you don't expect them to due to precision issues, or other user confusion because their mental model is misaligned with what's actually happening). There's definitely a strong argument to be made that naming clarity is more important than that, but I think it might not be strong enough to warrant the complexity of a switch in general in p5.
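
The precision concern can be reproduced with a small sketch. `CartesianVector` is a hypothetical class written for this example, assuming (as the comment describes) that the stored data is Cartesian and the heading is a view computed from it.

```javascript
// Hypothetical 2D vector: (x, y) is the stored data, heading is a view.
class CartesianVector {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }

  // View: computed from the stored data on every call.
  getHeading() {
    return Math.atan2(this.y, this.x);
  }

  // Writing through the view recomputes both stored components.
  setHeading(angle) {
    const magnitude = Math.hypot(this.x, this.y);
    this.x = magnitude * Math.cos(angle);
    this.y = magnitude * Math.sin(angle);
    return this;
  }
}

const u = new CartesianVector(1, 0);
u.setHeading(Math.PI / 2); // rotate to point along the y-axis

console.log(u.x === 0);             // false: x is a tiny nonzero number
console.log(Math.abs(u.x) < 1e-15); // true
console.log(u.y);                   // 1
```

A user who thinks of the heading as stored data might expect x to be exactly 0 after this call. It is not, because the write passed through trigonometric functions on the way to the stored components.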

So to me I think it makes sense to keep that pattern for properties both on global state and class state so that the feel of using those parts of the library is largely the same. I'm definitely OK with using a more descriptive name than values but I'm personally in favour of still using a data property for coordinates.

davepagurek • Oct 16 '25 17:10

Thanks @davepagurek! I appreciate the discussion.

> Mostly, it's not clear yet to me why there should be a separate pattern for classes vs following the currently strong precedent used elsewhere, and currently my gut reaction is that the reason to switch, in general, isn't quite strong enough to warrant the complexity of a deprecation and a pattern change.

This is really interesting because my understanding was that the pattern I'm proposing isn't a switch away from precedent, but rather a move towards it. I did a CTRL+F of the full API, and it looks like this pattern has the strongest precedent within p5.

I think the reason for our different perspectives is that I hadn't noticed the pattern you're suggesting, regarding data vs. views. I'm interested to look over the full p5 API again with this pattern in mind, and I totally need to do that.

I might have missed it because getValue()/setValue() don't seem to adhere to it, and the set() method doesn't seem to either. It also seems like eliminating the confusion between values and set() is important.

> What we choose to be a data property vs a view is still up to us though.

My current thinking is that this pattern is subtle and depends on the internal implementation, which an API is generally meant to hide. The heading and magnitude do characterize a vector via polar coordinates. Since there are often multiple ways of characterizing the same data, it seems like we'd often be asking users to guess the API pattern based on the implementation? To put it another way, if it's up to us whether specific information is data or a view, then it could go either way internally, so maybe it's better not to leak that choice into the API?

> Do all the global methods which also exist on p5.Graphics need to change to be consistent with that rule?

I think there's a clear way for users to understand that this is a special case? All the usual standalone functions are attached with their usual names.

> I'm definitely OK with using a more descriptive name than values

Sounds great. This seems especially important since the precedent will apply not just to Vector, but also potentially Matrix, Tensor, Series, and Sheet.

This is an interesting discussion, and I look forward to continuing it 😄 I always learn a lot from your perspective!

GregStanton • Oct 16 '25 18:10

> This is really interesting because my understanding was that the pattern I'm proposing isn't a switch away from precedent, but rather a move towards it.

I think it definitely can be considered a move towards consistency. Despite what I've said about how data properties are pretty consistent, operations on a view are defined pretty differently still (e.g. we have heading() and setHeading(), not getHeading(), but we use get* as a prefix in other areas like local storage access.) Not in p5, but Processing's PShape does seem to follow the convention you describe. So if this is a common mental model already, I agree it makes sense to move in this direction. To other readers of this thread: if you have thoughts, please weigh in!

> Since there are often multiple ways of characterizing the same data, it seems like we'd often be asking users to guess the API pattern based on the implementation?

While e.g. you could refer to a graph data structure as a matrix or as node objects with references to each other, I think it's reasonable for an API to take a stance on which representation the API is designed for, especially with the goal of avoiding hidden performance costs. That's I think the difference between leaking implementation details vs picking a representation. It's not a hard boundary of course, it's just that it's not a feasible design goal to make every characterization just as easy to use and performant, so it's the job of any API to make a choice, and to make that choice clear via the docs. (That could be via naming convention, but others do that differently, like how C++ tells you erasing from a vector has linear complexity but erasing from a list is constant due to their different representations.)

So I don't think making the user aware of the representation is a problem, but API design might not be the right way to do that. Mostly I just want to set clear expectations for usage, and API design could be a tool in that. e.g. in WebGPU mode, you'll have to await loadPixels(), which we could go to great lengths to avoid, but in that situation, it's more productive for both us and users to make users aware that this actually is a heavier operation.

So, more vibes-based than principled, but get*/set* prefixes feel heavier to me, especially compared to the method chaining in the existing vector API, which might not be how we want every class method to feel.

davepagurek • Oct 16 '25 19:10

Hi @davepagurek, thanks again for the great back-and-forth on this. The discussion about long-term API patterns is super valuable. As I was thinking about it, I did a deeper dive into the current 2.0 implementation of p5.Vector, and I discovered some critical technical issues with the values property that I think we need to address.

These findings are specific to the current implementation, but they have major implications for the API design and seem to support moving toward the explicit method-based approach we've been discussing.

A closer look at the values property

My analysis revealed two fundamental problems—one with the user-facing API design, and a more critical one with the underlying implementation.

1. API-level confusion

The API currently provides both a values property and getValue(i)/setValue(i) methods. From a user's perspective, this is confusing:

  • If values is an array of the vector's components, why wouldn't a user just access it directly with myVector.values[i]?
  • This forces us to document a confusing pattern: "Here is a values property that looks like an array, but please don't modify it directly; use these other methods instead."
  • A method name like toArray() would be clearer, as it implies a one-way copy, but values implies direct access.

2. Critical implementation flaws

The current implementation of the values property is unfortunately broken in a way that guarantees bugs and inconsistent states.

  • Broken Encapsulation: The get values() method returns a live reference to the internal _values array. This means a user can write myVector.values[0] = 99 and mutate the internal state of the vector directly, bypassing any validation.
  • Guaranteed State Corruption: Worse, if a user performs a common array mutation like myVector.values.push(10), the set values() method is never triggered. This means the internal array's length changes, but the dimensions property is never updated. The vector object is now in a corrupted state, which will lead to frustrating, hard-to-diagnose bugs for our users.
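
The two flaws can be reproduced with a simplified model of the behavior described above. `LeakyVector` is a hypothetical class written for this sketch, not the p5.Vector source. It assumes only what the bullets state: a getter that returns the internal array and a setter that updates a separately stored dimensions value.

```javascript
// Simplified model of the described behavior; NOT the p5.Vector source.
class LeakyVector {
  constructor(...components) {
    this._values = components;
    this.dimensions = components.length;
  }

  get values() {
    return this._values; // live reference to internal state
  }

  set values(newValues) {
    this._values = newValues;
    this.dimensions = newValues.length; // runs only on assignment
  }
}

const w = new LeakyVector(1, 2, 3);
w.values[0] = 99;  // writes straight into internal storage
w.values.push(10); // mutates the array; the setter never runs

console.log(w.values.length); // 4
console.log(w.dimensions);    // 3 (stale)
```

After the push, the array holds four numbers while the object still reports three dimensions, which is the inconsistent state the comment warns about.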

The inevitable conclusion

These implementation flaws—especially the state corruption bug—make the property-based values approach untenable. To ensure a vector object can never enter a broken state, we must require all component modifications to go through methods that can properly manage the internal state.

This leads directly back to the API I originally proposed:

getElement(i) / setElement(i, value)

getElements() / setElements(array)

This method-based approach is the only way to guarantee a robust, bug-free implementation. It solves the encapsulation and state corruption issues, eliminates the confusing redundancy of values and getValue/setValue, and cleanly replaces the old set() method.
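
One way such a method-based design might look is sketched below. `GuardedVector` and `assertValidIndex` are hypothetical names, and the index check is an assumed form of validation chosen for the example. The point is only that every write passes through code that can keep the stored dimension in sync.

```javascript
// Hypothetical helper: rejects indices outside the current dimension.
function assertValidIndex(index, length) {
  if (!Number.isInteger(index) || index < 0 || index >= length) {
    throw new RangeError(`Index ${index} is outside 0..${length - 1}`);
  }
}

// Hypothetical sketch of a method-only component API.
class GuardedVector {
  #values = [];
  #dimensions = 0;

  constructor(...components) {
    this.setElements(components);
  }

  get dimensions() {
    return this.#dimensions;
  }

  getElement(index) {
    assertValidIndex(index, this.#dimensions);
    return this.#values[index];
  }

  setElement(index, value) {
    assertValidIndex(index, this.#dimensions);
    this.#values[index] = value;
    return this;
  }

  getElements() {
    return [...this.#values]; // copy, so callers cannot mutate internals
  }

  setElements(array) {
    this.#values = [...array];
    this.#dimensions = this.#values.length; // updated on every bulk write
    return this;
  }
}

const g = new GuardedVector(1, 2, 3);
g.getElements().push(10);  // mutates a copy only
console.log(g.dimensions); // 3
g.setElements([4, 5]);
console.log(g.dimensions); // 2
```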

It seems these technical constraints provide a strong, independent reason to adopt this API structure, even before we settle the larger (and still very important!) discussion about library-wide getter/setter patterns.

What are your thoughts on this analysis?

GregStanton • Oct 17 '25 12:10

I think it's possible to keep the array access API for vectors if we make the dimension not be independent state, but something calculated from the values array, making the Vector class kind of like a wrapper around an array the way the gl-matrix library does. But the main downside of that is if we wanted to keep the same API for matrices. An array is already a pretty convenient way of interacting with vector data, but less so for matrices (a flat array representation like pixels or how gl-matrix works requires a lot of extra thought when accessing, and a nested array is harder to modify when changing shape, with more sub arrays to keep in sync.)
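
Both halves of this comment can be sketched briefly. In the first part, `ArrayBackedVector` is a hypothetical wrapper whose dimension is derived from the array instead of being stored. In the second part, the row-major layout is an arbitrary choice for illustration and is not a claim about how any particular library stores its matrices.

```javascript
// Hypothetical wrapper: the array is the only state; dimension is derived.
class ArrayBackedVector {
  constructor(...components) {
    this.values = components;
  }

  get dimensions() {
    return this.values.length; // always in sync with the array
  }
}

const a = new ArrayBackedVector(1, 2, 3);
a.values.push(4);
console.log(a.dimensions); // 4

// Flat matrix storage needs index arithmetic on every access.
// Row-major layout assumed: element (row, col) lives at row * cols + col.
const cols = 3;
const flat = [1, 2, 3, 4, 5, 6]; // a 2x3 matrix
const at = (row, col) => flat[row * cols + col];
console.log(at(1, 2)); // 6
```

With a derived dimension, direct array mutation no longer leaves the vector inconsistent. The index arithmetic in the second half is the "extra thought" that makes raw array access less convenient for matrices.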

So I think having element based access makes sense. I think we probably still want ways to bulk set data from common representations, so maybe that means keeping set() around and having easy ways to get the data out in bulk too, like maybe toArray() for vectors, and maybe with an additional options object for matrices which could have a few common array representations.

About being able to mutate internal state via an array: At least for transformation matrices, we'll need to store internal data as a flat array and be able to access that without conversion costs, including copying, which was a big bottleneck when I was debugging matrix perf a few weeks ago. While an array converter method probably should return a copy for safety, maybe we also add something like .getArrayUnsafe() that directly returns the underlying array while making it clear it's got usage limitations? Naming inspired by React which has a few methods like that, including "unsafe" or "dangerously" prefixes for things without safety checks.
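
The difference between a copying converter and an "unsafe" accessor might look like the sketch below. `FlatMatrix` is a hypothetical class. The method name getArrayUnsafe() comes from the suggestion above, but its behavior here is an assumption, and storing the data in a Float32Array is also just a choice made for this example.

```javascript
// Hypothetical matrix wrapper over a flat typed array.
class FlatMatrix {
  constructor(data) {
    this._data = Float32Array.from(data);
  }

  // Safe: allocates and returns a copy on every call.
  toArray() {
    return Array.from(this._data);
  }

  // Unsafe: returns the underlying storage itself, with no copy.
  getArrayUnsafe() {
    return this._data;
  }
}

const m = new FlatMatrix([1, 0, 0, 1]);

const copy = m.toArray();
copy[0] = 5; // changes the copy only

const raw = m.getArrayUnsafe();
raw[0] = 7;  // changes the matrix itself

console.log(m.toArray()[0]); // 7
```

The copy costs an allocation per call, which is the overhead the comment describes as a bottleneck. The unsafe accessor avoids it at the price of exposing internal state.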

davepagurek • Oct 17 '25 13:10

Thanks @davepagurek!

Revised API proposal

It looks like you're suggesting something along the lines of set()/toArray()/getElement()/setElement()? That was actually my backup API, since it's also a purely method-based approach. Thanks to your comment, I now realize it's likely better than the API I originally proposed, and it currently seems like the way to go.

Rationale

With everything else going on, I somehow overlooked the fact that I had already planned for a toArray() method on Color, Vector, Matrix, Transform, Tensor, ... with formatting options as appropriate (e.g. row major vs. column major, flat vs. nested). This is a key feature, since it can provide interoperability with matrices and tensors in other libraries. It's a good sign that you landed on the same design.
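
A toArray() with formatting options could take roughly the following shape. `DemoMatrix` and the option names `order` and `nested` are placeholders invented for this sketch; the thread does not specify an options API.

```javascript
// Hypothetical matrix with row-major internal storage.
class DemoMatrix {
  constructor(rows, cols, rowMajorData) {
    this.rows = rows;
    this.cols = cols;
    this._data = [...rowMajorData];
  }

  // `order` ('row' or 'column') and `nested` are placeholder option names.
  toArray({ order = 'row', nested = false } = {}) {
    const outer = order === 'row' ? this.rows : this.cols;
    const inner = order === 'row' ? this.cols : this.rows;
    const lines = [];
    for (let i = 0; i < outer; i++) {
      const line = [];
      for (let j = 0; j < inner; j++) {
        const row = order === 'row' ? i : j;
        const col = order === 'row' ? j : i;
        line.push(this._data[row * this.cols + col]);
      }
      lines.push(line);
    }
    return nested ? lines : lines.flat();
  }
}

const mat = new DemoMatrix(2, 3, [1, 2, 3, 4, 5, 6]);
console.log(mat.toArray());                    // [1, 2, 3, 4, 5, 6]
console.log(mat.toArray({ order: 'column' })); // [1, 4, 2, 5, 3, 6]
console.log(mat.toArray({ nested: true }));    // [[1, 2, 3], [4, 5, 6]]
```

Every call returns a fresh array, so the result can be handed to another library without exposing the matrix's internal storage.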

Also, I've been hoping to replace array() with toArray() for a long time now 😂 It's a clearer name, and moving to it improves consistency with toString(). That leads to a critical flaw in the design I originally proposed, since having both getElements() and toArray() would be redundant, and the name getElements() is less clear.

The revised API proposal also doesn't require set() to be deprecated.

Unsafe array feature

I'm really curious to hear more about getUnsafeArray()/getArrayUnsafe()/... Maybe we could have a separate discussion about that? Not that it's unrelated, but it sounds like that might be a longer conversation.

Next step: getter/setter inventory

I started putting together an inventory of all getters and setters in p5, classifying them according to the various patterns we discussed. I'm hoping that can help us come to stronger conclusions, and it should be a helpful resource to have in general.

GregStanton • Oct 17 '25 17:10

For the sake of brainstorming, I do like the symmetry of e.g. fromArray()/toArray() and getElement()/setElement(), and fromArray() has a precedent in three.js. However, fromArray() would overlap with set(), which is more economical and offers a more flexible set of overloads (in keeping with other vector operations). So I suspect this isn't the best option, but I thought I'd mention it in case it sparks any good ideas.
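
The overlap can be seen in a small sketch. `OverloadVector` is a hypothetical class, and the three overloads shown (separate numbers, an array, another vector) are meant only to illustrate the kind of flexibility the comment attributes to set().

```javascript
// Hypothetical class contrasting an overloaded set() with fromArray().
class OverloadVector {
  constructor(...components) {
    this._values = components;
  }

  // Accepts set(x, y, z), set([x, y, z]), or set(otherVector).
  set(...args) {
    if (args[0] instanceof OverloadVector) {
      this._values = [...args[0]._values];
    } else if (Array.isArray(args[0])) {
      this._values = [...args[0]];
    } else {
      this._values = args;
    }
    return this;
  }

  // Covers only the array case, so it duplicates one overload of set().
  fromArray(array) {
    this._values = [...array];
    return this;
  }

  toArray() {
    return [...this._values];
  }
}

const target = new OverloadVector(0, 0, 0);
console.log(target.set(1, 2, 3).toArray());                  // [1, 2, 3]
console.log(target.set([4, 5, 6]).toArray());                // [4, 5, 6]
console.log(target.set(new OverloadVector(7, 8)).toArray()); // [7, 8]
console.log(target.fromArray([9, 10]).toArray());            // [9, 10]
```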

GregStanton • Oct 22 '25 14:10

Thanks @davepagurek and @GregStanton — this discussion has been incredibly helpful, and I feel like I now have full clarity on both the technical concerns and the API direction.

After reviewing everything, I agree that the revised method-based API is the right path forward. It balances clarity, safety, and consistency across p5 classes, while also avoiding the pitfalls of the current values property.

Here's the final API direction as I understand it:

  • Add toArray() — replaces array() with a clearer, one-way copy method
  • Add getElement(i) / setElement(i, value) — safe index-based access
  • Deprecate getValue() / setValue() — rename to getElement() / setElement()
  • Deprecate array() — in favor of toArray()
  • Remove values property — unsafe, causes state corruption
  • Keep set() — flexible overloads, no longer needs deprecation
  • Drop getElements() / setElements() — redundant with toArray()

I can begin implementing this flow, starting with:

  1. Removing the values property and its internal _values exposure
  2. Adding toArray() and updating docs/examples
  3. Renaming getValue() / setValue() with deprecation warnings
  4. Keeping set() as-is, but clarifying its role in docs
  5. Deprecating array() and pointing users to toArray()

Thanks again for the thoughtful back-and-forth — I’ve learned a lot from both of you, and I’m excited to help stabilize this part of the API for 2.0 and beyond 🚀

Ayaan005-sudo • Nov 09 '25 12:11

Hi @Ayaan005-sudo, that is a perfect summary of the current API proposal!

Unfortunately, this API is still blocked and waiting for final approval.

Also, just a heads-up: the array() deprecation (your point 5) already has a PR from another enthusiastic volunteer. Your point about adding a warning that points to toArray() is a great idea! We can do that once this API is approved.

Thanks again for your patience, as well as your fantastic API summary!

GregStanton • Nov 09 '25 23:11

@GregStanton Thanks for the update! Totally understand that it's pending approval — I’ll hold off on implementation until it’s greenlit.

In the meantime, I’ll keep an eye on related PRs and continue reviewing the getter/setter inventory. Looking forward to helping finalize this once it’s ready!

Ayaan005-sudo • Nov 12 '25 04:11

Hi — I'm Shubham Kahar and I support this proposal.
It improves consistency and clarity for vector users, and I think it will simplify both API usage and documentation.
Happy to help test or document once the final decision is made. 🙂

Shubhamkahar196 • Nov 12 '25 16:11

Hi everyone, thanks so much for all the lively discussion of the p5.js 2.x Vector implementation! Now that 2.1 is released, we wanted to set up a more direct discussion space for p5.js 2.x Vector implementation bugfixes, documentation, and improvements. So, here is a Discord channel: https://discord.gg/gH3VcRKhen

As we discuss/unblock each of the vector issues, I will also follow up on those issues as a comment. So if you prefer to participate only (or primarily) on GitHub, that still also works!

ksen0 • Nov 14 '25 09:11