Diving into the inner workings of ReactJS
My take on explaining how ReactJS works
I’ve been using ReactJS for about 2 years and I finally took some time to learn the inner workings of this amazing framework. When I was starting my career, all I wanted was to see things working. But now I feel more mature as a developer, so it’s probably a good opportunity to take a deep dive and expand my overall JavaScript knowledge.
I searched the web for blog posts and documentation to help me in this endeavor, and ended up finding some really good references. I cannot recommend Dan Abramov’s blog enough, particularly this post about React as a UI runtime.
Another reference was this document about the React Fiber architecture. It’s a bit old and many things may have changed, but it does a good job of explaining some basic concepts.
Finally, this amazing walkthrough for building your own React helped me a lot to understand in practice how things work.
Enough talking, let’s get going.
React !== Host environment
Some people may think React itself outputs DOM elements to a webpage, creating things like div, h1 or p tags.
Actually, React is a layer on top of these primitive APIs, and React’s main job isn’t displaying things on a webpage.
The part responsible for displaying and outputting the final elements is called the renderer. It can take many forms, meaning the outputs can vary a lot! It doesn’t need to render DOM elements specifically (as a web developer is used to); it can also render native mobile UI, PDF, JSON and much more. These output environments are called hosts.
The renderer we are used to in the web environment is ReactDOM, which outputs DOM elements, of course. But think about React Native, which outputs native mobile UI, as another kind of renderer.
React ----> ReactDOM -----> DOM
The smallest building blocks of a host are its nodes; DOM nodes are one example. In React, the smallest building block is an element. A React element is just a plain object:
{
type: 'h1',
props: {
className: 'blue',
children: []
},
child: [],
dom: '',
parent: '',
alternate: '',
hooks: []
}
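There are lots of properties. Strictly speaking, a React element only needs type and props; the extra fields shown here (child, dom, parent, alternate, hooks) come from the simplified fiber model that wraps the element, which we will get to later. Just remember that the JSX we are used to writing is just syntactic sugar for these objects.
They can also be generated by calling React.createElement directly.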
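To make the relationship between JSX and plain objects concrete, here is a minimal sketch of a createElement-like helper. It is a simplified stand-in written for this article, not React's actual implementation, and it keeps only type and props.

```javascript
// A simplified stand-in for React.createElement (hypothetical, not React's real code).
function createElement(type, props, ...children) {
  return {
    type,
    props: { ...props, children },
  };
}

// Roughly what <h1 className="blue">Hello</h1> turns into under this sketch.
const element = createElement("h1", { className: "blue" }, "Hello");
console.log(JSON.stringify(element));
```

The point is only that the result is ordinary data: nothing has been drawn anywhere yet.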
Well, we talked about renderers and hosts, but what about React? What is its job?
What about React?
For every render there is an entry point. Think about the following syntax:
ReactDOM.render(reactElement, container);
The container is often a div which will be the root of the application, and the whole React element tree will render into it.
It’s React’s job to make the container match the React element. So whenever the React element changes, the container (DOM) will also change.
This is called Reconciliation.
Reconciliation
Reconciliation is the algorithm behind what is popularly known as the virtual DOM.
A huge part of React’s job is optimizing this algorithm to detect the changes between the two trees and update only what needs to change.
There are basically two ways of doing reconciliation:
- erase what is in the host instance and re-create the tree from scratch
- maintain the tree and just update it
So React has to decide between these two options, and it does so by checking the type attribute of the existing React element to see if it matches the type of the next element to render. If it matches, the element at that position is still the same kind of thing, so we don’t need to delete it and can reuse the same host instance (node), updating only what changed.
So the type and the tree position are very important when it comes to deciding whether to reuse or re-create the host instance.
When rendering lists in React, it’s important to use the key attribute. The list has many elements with the same type, but they can come in a different order on every render. By using the key prop, we help React re-order and reuse the elements instead of destroying and creating them again. That’s why keys should be unique identifiers among siblings and stay stable between renders.
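The reuse-or-recreate decision can be sketched in a few lines. This is a toy model under my own assumptions: an "instance" is a plain object standing in for a DOM node, and createInstance and reconcileChildren are hypothetical helper names, not React APIs.

```javascript
// Hypothetical sketch: decide which host instances to reuse between two renders.
let nextInstanceId = 1;

function createInstance(element) {
  return { id: nextInstanceId++, type: element.type, key: element.key };
}

function reconcileChildren(prevInstances, nextElements) {
  // Index the previous instances by key, falling back to their position.
  const byKey = new Map();
  prevInstances.forEach((instance, index) => {
    byKey.set(instance.key ?? index, instance);
  });

  return nextElements.map((element, index) => {
    const candidate = byKey.get(element.key ?? index);
    // Reuse only when both the key (or position) and the type match.
    if (candidate && candidate.type === element.type) {
      return candidate;
    }
    return createInstance(element);
  });
}

const first = reconcileChildren([], [
  { type: "li", key: "a" },
  { type: "li", key: "b" },
]);
// Same keys in a different order, and "a" now has a different type.
const second = reconcileChildren(first, [
  { type: "li", key: "b" },
  { type: "p", key: "a" },
]);
// "b" keeps its instance; "a" changed type, so it gets a brand new one.
console.log(second.map((instance) => instance.id));
```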
React components
React components are just functions that return other elements. Instead of calling these functions ourselves, we hand React the responsibility of calling them at the proper time.
So instead of
Layout({ children: Article() });
We define it like:
<Layout>
<Article />
</Layout>
We have to capitalize the component name because that is how JSX tells a function type element apart from a host element such as div, so React can handle it appropriately. When React sees a function element, it calls the function to see what the component wants to render, and it keeps descending the tree recursively until it has all the nodes to render.
Reconciliation is recursive!
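Here is a small sketch of that recursion, using plain objects instead of JSX. The resolve function is a hypothetical name of mine, and the sketch ignores state, keys and everything else a real reconciler tracks.

```javascript
// Hypothetical sketch: resolve function components until only host elements remain.
function resolve(element) {
  if (typeof element === "string") {
    return element; // a text node
  }
  if (typeof element.type === "function") {
    // A component: call it with its props and keep descending.
    return resolve(element.type(element.props));
  }
  // A host element: keep it and resolve each of its children.
  const children = (element.props.children || []).map((child) => resolve(child));
  return { type: element.type, props: { ...element.props, children } };
}

function Article() {
  return { type: "p", props: { children: ["Hello"] } };
}

function Layout(props) {
  return { type: "div", props: { children: props.children } };
}

// Equivalent in spirit to <Layout><Article /></Layout>.
const tree = resolve({
  type: Layout,
  props: { children: [{ type: Article, props: {} }] },
});
console.log(JSON.stringify(tree));
```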
Scheduling and the Call Tree
React also aims to determine when an update should be performed. It can delay updates or change the priority of an update, for example.
When we start a React app, we declare component functions and React puts them in a call tree. React keeps track of this call tree, which is similar to the call stack in JavaScript (but with important differences). React has to know which components to call, in which order, and their respective states.
[<App />, <Layout />, <Header />, <Footer />];
Since we are dealing with UI components and not just regular functions, too much work at once can make the page slow and hurt the UX. The components in the stack can also become stale.
There are two nice browser APIs that React used to rely on:
- requestIdleCallback: schedules a low priority function to be called during the browser’s idle periods
- requestAnimationFrame: schedules a high priority function to be called before the next frame
React doesn’t use these APIs anymore. Instead, it uses a scheduler package the team built. See here.
But for learning purposes, we can stick with requestIdleCallback.
In order to use these APIs for scheduling, we need to break the call stack into multiple incremental units of work. That’s what React Fiber does: each fiber can be managed on its own, like a single virtual stack frame.
Unlike the call stack, React doesn’t destroy the frame after the function is called. The element lives inside this frame, called a fiber. The fiber is only destroyed when the reconciler says so.
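The idea of splitting work into units that can be paused can be sketched like this. It assumes a deadline object with a timeRemaining() method, like the one requestIdleCallback passes to its callback; createWorkLoop and fakeDeadline are hypothetical helpers so the sketch can run outside a browser.

```javascript
// Hypothetical sketch of an interruptible work loop.
function createWorkLoop(units) {
  let nextIndex = 0;
  const done = [];

  function workLoop(deadline) {
    // Process one unit at a time and yield once the time budget is used up.
    while (nextIndex < units.length && deadline.timeRemaining() > 0) {
      done.push(units[nextIndex]());
      nextIndex++;
    }
    return nextIndex < units.length; // true while work is still pending
  }

  return { workLoop, done };
}

// A fake deadline that allows a fixed number of units per "idle period".
function fakeDeadline(budget) {
  let remaining = budget;
  return { timeRemaining: () => remaining-- };
}

const units = ["App", "Layout", "Header", "Footer"].map((name) => () => name);
const { workLoop, done } = createWorkLoop(units);

const pendingAfterFirst = workLoop(fakeDeadline(2)); // processes two units, then yields
const pendingAfterSecond = workLoop(fakeDeadline(5)); // picks up where it left off
console.log(done, pendingAfterFirst, pendingAfterSecond);
```

In a browser, the same workLoop could be handed to requestIdleCallback and re-scheduled while it keeps returning true.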
A fiber is a JavaScript object, the same one we presented at the beginning of this article:
{
type: 'h1',
props: {
className: 'blue',
children: []
},
child: [],
dom: '',
parent: '',
alternate: '',
hooks: []
}
About some of its properties:
- type: describes the component. When the component comes from the host environment like the DOM (for example div or h1), the type is a string. When it is a composite component like <Layout />, the type is a function.
- key: in a list, the key helps the reconciler decide whether the fiber can be reused.
- child: points to another fiber, a child fiber.
- return: also points to a fiber, the one the program should return to after processing the current fiber. It’s basically the parent fiber (called parent in the object above).
- pending props and memoized props: these are the props passed to the component. Pending props are set at the beginning of execution and memoized props are set at the end. When pendingProps === memoizedProps, the fiber’s previous output can be reused.
- pendingWorkPriority: determines the priority of the work. The scheduler uses it to determine the next unit of work to perform. Larger numbers indicate lower priority.
- alternate: links the two versions of a fiber, the latest one flushed (rendered) and the current work-in-progress one.
- output: the leaf nodes, specific to the host environment (for the DOM these would be div, span, h1…). Outputs are created only by host components and are eventually handed to the renderer.
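To see how these pointers let React walk the tree without recursion, here is a small traversal sketch. It assumes a sibling pointer in addition to child and return; sibling is not in the list above, so treat it as my assumption, and nextFiber is a hypothetical helper name.

```javascript
// Hypothetical sketch: visit fibers using child, sibling and return pointers.
function nextFiber(fiber) {
  if (fiber.child) return fiber.child; // go down first
  let current = fiber;
  while (current) {
    if (current.sibling) return current.sibling; // then sideways
    current = current.return; // then back up to the parent
  }
  return null; // climbed past the root: no work left
}

// Build a tiny tree: App -> (Header, Footer).
const app = { type: "App", child: null, sibling: null, return: null };
const header = { type: "Header", child: null, sibling: null, return: app };
const footer = { type: "Footer", child: null, sibling: null, return: app };
app.child = header;
header.sibling = footer;

const visited = [];
for (let fiber = app; fiber; fiber = nextFiber(fiber)) {
  visited.push(fiber.type);
}
console.log(visited);
```

Because each step only needs the current fiber, the walk can stop after any fiber and resume later, which is what makes the work interruptible.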
About useState
The order in which you call your useState() functions is very important. When calling useState, we are putting the value (e.g. count) and the setter (e.g. setCount) in some sort of array. Then we define a cursor for them, some kind of index number. So every time we update the component or set a new value for the state, React takes the cursor (index) and uses it to retrieve the right value/setter. It all boils down to this index number.
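A toy version of this mechanism fits in a few lines. The useState below is a local function that shadows nothing from React, and render is a hypothetical helper; real React stores hooks per fiber and does much more, so read this as a mental model only.

```javascript
// Hypothetical sketch of useState backed by an array and a cursor.
const hooks = [];
let cursor = 0;

function useState(initialValue) {
  const index = cursor; // freeze the position for this call
  if (hooks[index] === undefined) {
    hooks[index] = initialValue;
  }
  const setState = (value) => {
    hooks[index] = value;
  };
  cursor++;
  return [hooks[index], setState];
}

function render(component) {
  cursor = 0; // every render walks the hooks array from the start
  return component();
}

function Counter() {
  const [count, setCount] = useState(0);
  const [label, setLabel] = useState("clicks");
  return { count, label, setCount, setLabel };
}

const firstRender = render(Counter);
firstRender.setCount(1);
const secondRender = render(Counter);
console.log(secondRender.count, secondRender.label);
```

If a useState call were skipped on some renders (inside an if, for example), every later call would read the wrong slot, which is why the call order has to stay the same.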
By default, when a parent schedules an update with setState, every child in the subtree is reconciled, because React doesn’t know whether the parent’s update changes the children. That’s why we can use the memo function to memoize child components, so they are only reconciled when their props change (checked with a shallow comparison).
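As a rough illustration of the shallow comparison, here is a memo-like wrapper. The memo and shallowEqual functions below are local sketches, not React's implementation, and they cache a single result per wrapped function rather than per component instance.

```javascript
// Hypothetical sketch of what a memo-like wrapper does.
function shallowEqual(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key])
  );
}

function memo(component) {
  let lastProps = null;
  let lastResult = null;
  return function Memoized(props) {
    if (lastProps !== null && shallowEqual(lastProps, props)) {
      return lastResult; // props are shallowly equal: skip the call
    }
    lastProps = props;
    lastResult = component(props);
    return lastResult;
  };
}

let calls = 0;
const Child = memo((props) => {
  calls++;
  return `Hello ${props.name}`;
});

Child({ name: "Ana" });
Child({ name: "Ana" }); // skipped: same prop values
Child({ name: "Bob" }); // runs again: name changed
console.log(calls);
```

Note that the comparison is shallow: a new object or function passed as a prop counts as a change even if its contents are the same.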
React also batches updates. Imagine that by clicking a div, we update both parent and child. In this case the child would be updated twice: once for itself and once as an effect of the parent update. By batching updates, React updates it only once.
To avoid stale state, use the functional form of setState, or use the useReducer pattern with dispatch.
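The difference between passing a value and passing a function shows up clearly once updates are queued. The createState helper below is a hypothetical model of a batched update queue, not how React stores state.

```javascript
// Hypothetical sketch: updates are queued and applied together on flush.
function createState(initial) {
  let value = initial;
  const queue = [];
  return {
    get: () => value,
    set: (update) => {
      queue.push(update); // nothing is applied yet
    },
    flush: () => {
      for (const update of queue) {
        value = typeof update === "function" ? update(value) : update;
      }
      queue.length = 0;
    },
  };
}

// Passing a value: both calls are computed from the same stale `count`.
const stale = createState(0);
const count = stale.get();
stale.set(count + 1);
stale.set(count + 1);
stale.flush();

// Passing a function: each update receives the latest value.
const fresh = createState(0);
fresh.set((c) => c + 1);
fresh.set((c) => c + 1);
fresh.flush();

console.log(stale.get(), fresh.get());
```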
Talking about render before talking about effect
Whenever we update state, React calls our component again. Each render sees its own state value, which is a constant inside our function.
So the main thing is: React state is not a watcher, not a data binding. It’s just regular data, updated on each render (and also isolated between renders).
Just like in any other JavaScript function, each call has its own scope. Therefore, it doesn’t matter how many times the component renders: if we console.log a state value, it will capture the value at the moment that render was called (and not necessarily the latest value).
This principle applies to event handlers too. Each render has its own version of the event handlers, and each version remembers its own state.
Inside any particular render, props and state forever stay the same.
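This is ordinary closure behavior, which a sketch without React can show. Here renderCounter is a hypothetical stand-in for one call of a component function.

```javascript
// Hypothetical sketch: each "render" is a function call with its own constants.
function renderCounter(count) {
  // The handler closes over the `count` of this particular call.
  const handleClick = () => `You clicked ${count} times`;
  return { handleClick };
}

const renderA = renderCounter(0);
const renderB = renderCounter(1); // a later render with updated state

// The handler from the first render still reports the value it was created with.
console.log(renderA.handleClick());
console.log(renderB.handleClick());
```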
Each render has its own effects
function Counter() {
const [count, setCount] = useState(0);
useEffect(() => {
document.title = `You clicked ${count} times`;
});
}
Following the same principles we talked about above, the effect function is different on each render, and uses the props and state from that particular render.
Effects run after the component renders into the DOM and the browser paints the screen.
Effects rely on JavaScript closures to reference values. Closures are great when the values never change, and as we saw before, the values (props and state) never change within a particular render.
Be aware: sometimes you will want to reference the latest value of a state or prop. For this, we can use a ref and mutate its value:
function MessageThread() {
const [message, setMessage] = useState('');
// Keep track of the latest value, by mutating the ref
const latestMessage = useRef('');
useEffect(() => {
latestMessage.current = message;
});
const showMessage = () => {
alert('You said: ' + latestMessage.current);
};
}
But reading a future value from a past render is a break in the paradigm (not necessarily wrong, though).
The cleanup phase
The previous effect is only cleaned up after the re-render with new props.
- React renders the UI for the new value.
- The browser paints and the user sees the new value.
- Then React cleans up the effect for the previous value.
- Then React runs the effect for the new value.
So we may ask: how can React still see the old value even after it renders the new one?
Remember there is no concept of old and new values. Effects, event handlers and timeouts only see the props and state from the particular render they belong to.
So the cleanup just reads the props that belong to the render it was defined in. So yes, the old props are still there if our code needs them.
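The pairing of each cleanup with its own render can be sketched with closures alone. renderAndCommit is a hypothetical helper that collapses render and commit into one step, and the subscribe/unsubscribe strings are made up for the illustration.

```javascript
// Hypothetical sketch of how effects and cleanups are paired across renders.
const log = [];
let previousCleanup = null;

function renderAndCommit(props) {
  // The effect and its cleanup both close over this render's props.
  const effect = () => {
    log.push(`subscribe ${props.id}`);
    return () => log.push(`unsubscribe ${props.id}`);
  };
  // Commit: clean up the previous effect first, then run the new one.
  if (previousCleanup) previousCleanup();
  previousCleanup = effect();
}

renderAndCommit({ id: 10 });
renderAndCommit({ id: 20 });
console.log(log);
```

The cleanup that runs during the second commit still reports id 10, because it was created by the first render and never saw anything else.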
It’s all about synchronizing stuff
React synchronizes the DOM according to our current props and state. Similarly, useEffect lets you sync things outside the React tree according to our props and state. So there is no difference between mounting, updating and unmounting: it’s all about sync.
If you’re trying to write an effect that behaves differently depending on whether the component renders for the first time or not, you’re swimming against the tide!
Diffing effects
We pass the dependency array as the second argument to useEffect to tell React that our effect only uses those values from the render scope, and nothing else. So React only needs to re-synchronize the effect when those values change.
Don’t lie to React by omitting dependencies the effect actually uses.
Always specify the values your effect uses. It helps avoid bugs.
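A minimal sketch of the dependency check could look like this. depsChanged and useEffectSketch are hypothetical names, and the sketch tracks a single effect and runs it immediately instead of after paint; React compares each dependency with Object.is.

```javascript
// Hypothetical sketch: re-run an effect only when a dependency changed.
function depsChanged(prevDeps, nextDeps) {
  if (prevDeps === undefined || nextDeps === undefined) return true; // no array: always run
  if (prevDeps.length !== nextDeps.length) return true;
  return nextDeps.some((dep, i) => !Object.is(dep, prevDeps[i]));
}

let runs = 0;
let lastDeps;

function useEffectSketch(effect, deps) {
  if (depsChanged(lastDeps, deps)) {
    effect();
  }
  lastDeps = deps;
}

useEffectSketch(() => runs++, ["react"]); // first render: runs
useEffectSketch(() => runs++, ["react"]); // same value: skipped
useEffectSketch(() => runs++, ["redux"]); // value changed: runs
console.log(runs);
```

This also shows why omitting a dependency is a lie: the check would keep skipping an effect whose inputs have actually changed.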
But we can simplify things. If, for example, we rely on a count variable to update the state inside an effect:
useEffect(() => {
const id = setInterval(() => {
setCount(count + 1);
}, 1000);
return () => clearInterval(id);
}, [count]);
We could get rid of the count dependency by doing:
useEffect(() => {
const id = setInterval(() => {
setCount((c) => c + 1);
}, 1000);
return () => clearInterval(id);
}, []);
It helps to send only the minimal necessary information from inside the effect into the component.
However, setCount(c => c + 1) is not great either, since it is very limited. A more powerful tool is useReducer.
const initialState = {
count: 0,
step: 1,
};
function reducer(state, action) {
const { count, step } = state;
if (action.type === "tick") {
return { count: count + step, step };
} else if (action.type === "step") {
return { count, step: action.step };
} else {
throw new Error();
}
}
const [state, dispatch] = useReducer(reducer, initialState);
const { count, step } = state;
useEffect(() => {
const id = setInterval(() => {
dispatch({ type: "tick" }); // Instead of setCount(c => c + step);
}, 1000);
return () => clearInterval(id);
}, [dispatch]);
React guarantees the dispatch function to be constant throughout the component lifetime. So the example above never needs to resubscribe the interval. We could omit dispatch from the deps because it will always be static, but it doesn’t hurt to specify it.
The example above is good because it decouples the effect from the state update. Our effect doesn’t care how we update the state, it just indicates what happened.
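Because the reducer is a pure function, it can be exercised without React at all. The sketch below repeats the reducer from the example (with a slightly more descriptive error message, which is my addition) and feeds it a made-up list of actions.

```javascript
// The reducer is a pure function, so it can be run outside of any component.
function reducer(state, action) {
  const { count, step } = state;
  if (action.type === "tick") {
    return { count: count + step, step };
  } else if (action.type === "step") {
    return { count, step: action.step };
  }
  throw new Error(`Unknown action: ${action.type}`);
}

// A made-up sequence of events: tick, change the step to 5, tick again.
const actions = [{ type: "tick" }, { type: "step", step: 5 }, { type: "tick" }];
const finalState = actions.reduce(
  (state, action) => reducer(state, action),
  { count: 0, step: 1 }
);
console.log(finalState);
```

The effect only has to say what happened ("tick"); how that translates into a new count lives entirely in the reducer.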
A reducer can get access to the latest component props, provided it is declared inside the component, because when we call dispatch, React calls the reducer during the next render. So the new props are already in scope.
This is why I like to think of useReducer as the “cheat mode” of Hooks. It lets me decouple the update logic from describing what happened. This, in turn, helps me remove unnecessary dependencies from my effects and avoid re-running them more often than necessary.
Functions inside effects
If we need to call some function inside an effect (for example to fetch some data), we can define the function (or functions) inside the effect. This is helpful because we don’t have to keep thinking about which dependencies to add to the effect, since a function can call another function that uses state.
By defining the functions directly inside the effect, we keep track of the dependencies being used in a clear way:
import { useState, useEffect } from "react";
import axios from "axios";

function SearchResults() {
  const [query, setQuery] = useState("react");
  // State that receives the fetched results
  const [data, setData] = useState(null);
  useEffect(() => {
    function getFetchUrl() {
      return "https://hn.algolia.com/api/v1/search?query=" + query;
    }
    async function fetchData() {
      const result = await axios(getFetchUrl());
      setData(result.data);
    }
    fetchData();
  }, [query]); // ✅ Deps are OK
  // ...
}
The design of useEffect forces you to notice the change in our data flow and choose how our effects should synchronize it — instead of ignoring it until our product users hit a bug.
If you don’t want or can’t put a function inside the effect
Let’s say you have a function used by two effects in your component. You don’t want to copy and paste it into both effects, and you can’t simply list it as a dependency (because a function defined in the component, like any other value, is re-created on every render, so putting it in the dependency array would make the effect re-run every time). You can either:
- Hoist it: if a function doesn’t use anything from the component scope, you can move it outside the component and then freely use it inside your effects:
// ✅ Not affected by the data flow
function getFetchUrl(query) {
return "https://hn.algolia.com/api/v1/search?query=" + query;
}
function SearchResults() {
useEffect(() => {
const url = getFetchUrl("react");
// ... Fetch data and do something ...
}, []); // ✅ Deps are OK
useEffect(() => {
const url = getFetchUrl("redux");
// ... Fetch data and do something ...
}, []); // ✅ Deps are OK
// ...
}
- Wrap it in a useCallback hook:
function SearchResults() {
  const [query, setQuery] = useState("react");
  // ✅ Preserves identity until query changes
  const getFetchUrl = useCallback(() => {
    // Reads query from the component scope, so it is listed in the deps below
    return "https://hn.algolia.com/api/v1/search?query=" + query;
  }, [query]); // ✅ Callback deps are OK
  useEffect(() => {
    const url = getFetchUrl();
    // ... Fetch data and do something ...
  }, [getFetchUrl]); // ✅ Effect deps are OK
  useEffect(() => {
    const url = getFetchUrl();
    // ... Fetch data and do something else ...
  }, [getFetchUrl]); // ✅ Effect deps are OK
  // ...
}
useCallback adds another layer of dependency checks. It makes the function change only when necessary.
So in the example above: if the query state changes, the getFetchUrl function will change. Therefore, the effects using this function will also re-run and refetch data.
Only use this useCallback pattern when you have to pass functions down to children or in a case like the one above. Don’t use it everywhere.
Race condition in effects
If we have a situation where race conditions can happen (like 2 requests made at nearly the same time with the responses coming back in the wrong order), we can do the following to avoid it:
function Article({ id }) {
const [article, setArticle] = useState(null);
useEffect(() => {
let didCancel = false;
async function fetchData() {
const article = await API.fetchArticle(id);
if (!didCancel) {
setArticle(article);
}
}
fetchData();
return () => {
didCancel = true;
};
}, [id]);
// ...
}
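To see why the didCancel flag helps, here is a sketch that simulates two overlapping requests resolving out of order. fakeFetch and runEffect are hypothetical helpers; responses are delivered by hand so the ordering is fully under our control.

```javascript
// Hypothetical sketch: simulate two overlapping requests that resolve out of order.
const pending = [];
function fakeFetch(id, onDone) {
  // Store the callback so the "responses" can be delivered in any order.
  pending.push(() => onDone(`article ${id}`));
}

let shown = null;

function runEffect(id) {
  let didCancel = false;
  fakeFetch(id, (article) => {
    if (!didCancel) shown = article; // ignore responses from cancelled effects
  });
  return () => {
    didCancel = true;
  };
}

const cleanupFirst = runEffect(1);
cleanupFirst(); // id changed: the first effect is cleaned up
runEffect(2);

// Deliver the responses in the "wrong" order: newest first, oldest last.
pending[1]();
pending[0]();
console.log(shown);
```

Without the flag, the late response for article 1 would overwrite article 2 on screen.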
Review of React Fiber
- When a React app renders, a tree of nodes is kept in memory.
- There is a render phase, when React calls the components and performs reconciliation.
- Then there is a commit phase, when the tree is flushed to the rendering environment (DOM, iOS, Android…).
- When the app is updated, a new node tree is generated.
- The new tree is diffed against the old tree, and the elements are either updated or replaced.