Learning Focus Management in Smart TV apps
One of the first things you’ll need to learn when starting your Smart TV app development journey is focus management. While regular mobile apps and even websites can be navigated simply by touch, that’s obviously not the case with TV apps. The TV is of course sitting further away from the user. It’s why TV remotes came to life. So how do you handle that TV remote input? Enter focus management. In this blog we look at exactly that: how we can use focus management for intuitive navigation inside our TV applications.
Understanding focus and TV remote input
Before we dive into the different types and mechanisms of focus management, let’s first look at the core of TV remote input. Mouse and touch input have something in common: it’s very easy to scroll through pages by simply going up or down with your finger or the scroll wheel. Similarly, you can select an item on the screen by clicking or tapping on it. That selecting part is actually easy on Smart TV too. You can usually select the item that is ‘in focus’ with the OK or ENTER button. But what exactly does it mean to have an item ‘in focus’?
At the very core of how users navigate through TV applications is exactly that: focus. It is an indicator, often a big border and enlarged letters, showing where in the interface the user currently ‘is’. It should be immediately clear to the user where they are, but also where they can navigate to next. So having that clear indication with e.g. a border is crucial.
The effect of having a focus indicator is that the user can also better understand where they can go next. TV remotes have four directional buttons that are used for navigation. Of course, you need to be able to go in all directions, meaning you’ll have the left, right, up and down buttons to navigate with. So when the focus indicator is on the first item on a page, and the user presses ‘right’, you can already guess where the focus indicator should move to. That’s right, the second item, located on the right side of the first. Congrats! You’ve just learned the basics of focus and focus management. As you can imagine, we are just getting started.
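To make that first step concrete, here is a minimal sketch of moving focus inside a single horizontal row. The names (`KEY_TO_DIRECTION`, `moveFocusInRow`) are made up for this example, and I’m assuming the remote reports browser-style arrow key names; many TV platforms use their own key codes, so check what your platform actually sends.

```typescript
type Direction = "left" | "right" | "up" | "down";

// Assumption: d-pad presses arrive with browser-style key names.
// Real TV platforms may report different (often numeric) key codes.
const KEY_TO_DIRECTION: Record<string, Direction> = {
  ArrowLeft: "left",
  ArrowRight: "right",
  ArrowUp: "up",
  ArrowDown: "down",
};

// Moves focus within one horizontal row of `itemCount` items.
// Returns the new focused index; focus stays put at the row's edges.
function moveFocusInRow(focusedIndex: number, itemCount: number, key: string): number {
  const direction = KEY_TO_DIRECTION[key];
  if (direction === "right") return Math.min(focusedIndex + 1, itemCount - 1);
  if (direction === "left") return Math.max(focusedIndex - 1, 0);
  return focusedIndex; // up/down and unknown keys do not move focus inside this row
}

const rowItems = ["A", "B", "C"];
const nextIndex = moveFocusInRow(0, rowItems.length, "ArrowRight");
console.log(rowItems[nextIndex]); // "B"
```

Focus simply stops at the first and last item here. Whether it should wrap around instead is a design decision for your app.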
Spatial Navigation
With the basics in your pocket, it’s time to jump straight to the most difficult implementation of focus management. Better get it out of the way first, right? As it says in the name, spatial navigation uses the physical space that items occupy on the screen to determine the next focusable item. Those are some fancy words, so let me try and simplify it a bit. With spatial navigation, you basically look for the closest item in the direction the user has pressed. So if the user presses right, we look for the item to the right that’s closest to the one currently in focus. While this may sound easy, depending on your user interface you can have very complex scenarios. Let’s look at a few examples below to show you what I mean.
Example one should be rather easy, right? And it is! In this simple row/grid structure, items are placed directly next to, below or above each other. That means that when the user presses one of the navigational buttons, we don’t need to think much about where to navigate to. When focus is on A and the user presses ‘right’, focus obviously shifts to B. Similarly, going down from A moves focus to D. But what if the UI is a bit more complex?
Okay, so we have the grid structure, only the B and E items are moved up, both occupying half of an ‘edge’ with A. So, focus is on A and the user presses right. Which item are you going to focus now? I already threw in the term just now: in this example we’ll need to look at the item with the closest ‘edge’ to our currently focused item. Which, if my quick drawings are accurate, would make both B and E equally suitable to be selected. So how do we execute these calculations exactly, and what makes them both equally suitable?
Granted, there are many different implementations you can use here. A common approach is to use a so-called ‘frustum’ in your calculations. It’s a truncated cone (on a flat screen, effectively a trapezoid that widens away from the item) that you place directly at the edge of your currently focused item. Then, by checking which edges of potentially focusable elements fall inside that frustum area, you can select the closest item.
In our example, both B and E would be potential candidates to select next. Both fall into our frustum selection area. What you do in this case is then up to you. Because both match the criteria with the exact same value (both sharing an edge at ~50%), you do need to decide which one you’ll give focus to. In most cases you would select the item placed highest on the screen first, as it fits the natural flow of the user interface best. But of course, that might depend on your exact use cases.
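Here is a rough sketch of that selection logic. To keep it short I’m simplifying: instead of a widening frustum, it projects the focused item’s edge straight ahead and only considers items that share part of that edge. All names (`Box`, `findNext` and so on) and the coordinates are invented for this example. Candidates are ranked by closest edge, then by largest shared edge, and ties go to the item placed highest on screen.

```typescript
type Direction = "left" | "right" | "up" | "down";

interface Box {
  id: string;
  x: number; // left edge
  y: number; // top edge
  width: number;
  height: number;
}

// Length of the overlap between the 1D ranges [aStart, aEnd] and [bStart, bEnd].
function overlap(aStart: number, aEnd: number, bStart: number, bEnd: number): number {
  return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart));
}

// Gap between the facing edges; negative when `to` does not lie fully in that direction.
function edgeDistance(from: Box, to: Box, dir: Direction): number {
  if (dir === "right") return to.x - (from.x + from.width);
  if (dir === "left") return from.x - (to.x + to.width);
  if (dir === "down") return to.y - (from.y + from.height);
  return from.y - (to.y + to.height); // "up"
}

// How much of the edge the two boxes share, measured across the direction of travel.
function sharedEdge(from: Box, to: Box, dir: Direction): number {
  return dir === "left" || dir === "right"
    ? overlap(from.y, from.y + from.height, to.y, to.y + to.height)
    : overlap(from.x, from.x + from.width, to.x, to.x + to.width);
}

function findNext(current: Box, all: Box[], dir: Direction): Box | undefined {
  const candidates = all.filter(
    (b) => b.id !== current.id && edgeDistance(current, b, dir) >= 0 && sharedEdge(current, b, dir) > 0
  );
  candidates.sort(
    (a, b) =>
      edgeDistance(current, a, dir) - edgeDistance(current, b, dir) || // closest edge first
      sharedEdge(current, b, dir) - sharedEdge(current, a, dir) || // then largest shared edge
      a.y - b.y || // tie-break: highest on screen
      a.x - b.x // then leftmost
  );
  return candidates[0];
}

// B and E each share half of A's right edge, as in the example above.
const boxA: Box = { id: "A", x: 0, y: 100, width: 100, height: 100 };
const boxB: Box = { id: "B", x: 120, y: 50, width: 100, height: 100 };
const boxE: Box = { id: "E", x: 120, y: 150, width: 100, height: 100 };
const layout = [boxA, boxB, boxE];

console.log(findNext(boxA, layout, "right")?.id); // "B" (tie with E, B is higher)
console.log(findNext(boxB, layout, "down")?.id); // "E"
console.log(findNext(boxA, layout, "left")?.id); // undefined
```

A real frustum would also accept items that sit slightly outside the projected edge, getting more lenient the further away they are. That is exactly the part that needs the most tuning and testing against your own layouts.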
As you can tell, spatial navigation can quickly become very complex. So if you do go down this road, think clearly of the implications of differently-placed items, and ensure your testing scenarios cover all or at least the majority of the use cases you expect to have inside your TV app.
Directional Navigation (Focus Tree)
If spatial navigation sounds a bit too complex for you to implement and maintain, I honestly don’t blame you. Luckily there is an easier way to implement focus management. It may not be as versatile as spatial navigation, and it won’t cover every use case out of the box, but the ‘directional’ equivalent can be just as powerful, and much easier to implement.
As the word implies, directional navigation relies solely on the direction of items. As luck would have it, the interface in TV applications is usually composed of lists, grids and rows. In most cases, that means items are placed neatly above, below and next to each other in an orderly fashion. And in our implementation we should take advantage of that. Looking at our earlier example, we can order our items into several sets of lists (or a grid structure, if you’d prefer that):
In our example above, we have now (as we should do in code) split up the items into different rows. Our example of A → B still exists from earlier, only we have now clarified that item B belongs to row 1, and item E belongs to row 2. So even if the position of items B and E shifted up (as in our earlier example), you would never shift focus from A to E, as they don’t belong to the same row. Focus remains in the same row when navigating left and right (for a horizontal row; it would be up and down for a vertical row). A much easier implementation, achieving a result similar to spatial navigation.
What we have now built here is often called a ‘focus tree’. You’re building up a set of lists, rows and grids, each containing items, and using the ‘focus tree’ to shift focus between the different types of elements. Taking the example above, you can also encapsulate Row 1 and Row 2 in yet another Row or List item. Then you might also have a menu at the top, which focus can be shifted towards. All of these end up in the focus tree, allowing for easy navigation. Let me show you what I mean:
What the images above demonstrate is the user interface as you might see it on screen, and the associated focus tree. You can see the menu and ‘list 1’ being at the same level, meaning focus only shifts between them ‘as a whole’. Going deeper into the focus tree, focus shifts only between items at the same level, or one level above or below it. It is important to keep track of whether a row or list is horizontal or vertical. As mentioned earlier, when a row is horizontal, shifting focus left and right means shifting focus to a different item in the row itself. Pressing down gives the focus back to the list, which can then distribute it to the next row. And that process of course changes to up/down inside the row, and left/right to the list, when you’re working with vertical lists next to each other.
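The focus tree idea can be sketched in a few lines. This is just one way to model it, with invented names (`FocusNode`, `container`, `leaf`, `navigate`) and a tree shaped like the menu and list described above. Each container knows its orientation. When a key comes in, we walk up from the focused item until we find a container that handles that axis and has a sibling in that direction.

```typescript
type Direction = "left" | "right" | "up" | "down";
type Orientation = "horizontal" | "vertical";

interface FocusNode {
  id: string;
  orientation?: Orientation; // only set on containers (rows, lists)
  children: FocusNode[];
  parent?: FocusNode;
}

function leaf(id: string): FocusNode {
  return { id, children: [] };
}

function container(id: string, orientation: Orientation, children: FocusNode[]): FocusNode {
  const node: FocusNode = { id, orientation, children };
  for (const child of children) child.parent = node;
  return node;
}

// Descends into a subtree until it reaches its first focusable item.
function firstLeaf(node: FocusNode): FocusNode {
  const first = node.children[0];
  return first ? firstLeaf(first) : node;
}

function navigate(focused: FocusNode, dir: Direction): FocusNode {
  const axis: Orientation = dir === "left" || dir === "right" ? "horizontal" : "vertical";
  const step = dir === "right" || dir === "down" ? 1 : -1;
  let node = focused;
  while (node.parent) {
    const parent = node.parent;
    if (parent.orientation === axis) {
      const sibling = parent.children[parent.children.indexOf(node) + step];
      if (sibling) return firstLeaf(sibling);
    }
    node = parent; // this container cannot handle the key, so hand it to its parent
  }
  return focused; // nobody handled the key: focus stays where it is
}

const itemA = leaf("A");
const itemB = leaf("B");
const itemD = leaf("D");
const menu = container("menu", "horizontal", [leaf("Home"), leaf("Search")]);
const row1 = container("row 1", "horizontal", [itemA, itemB, leaf("C")]);
const row2 = container("row 2", "horizontal", [itemD, leaf("E"), leaf("F")]);
const list1 = container("list 1", "vertical", [row1, row2]);
container("root", "vertical", [menu, list1]);

console.log(navigate(itemA, "right").id); // "B"
console.log(navigate(itemB, "down").id); // "D"
console.log(navigate(itemA, "up").id); // "Home"
```

Note that going down from B lands on D here, because the sketch always enters a container at its first item. A real implementation would probably remember the last focused item per container, or try to keep the same column.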
Fixed Navigation
The easiest approach, and one that needs few words, is fixed navigation. This really is only for specific use cases and small applications. As the name suggests, fixed focus is very restricted in where users can navigate to. In simple terms, with fixed focus you explicitly specify what the next focusable item is. Looking at our row example again, with fixed navigation you would explicitly tell item A that the item to focus when the user presses right is item B. Similarly for item C, and so on and so forth.
While it is very easy to implement, you can imagine it’s not really scalable. If you’re working with a very simple user interface, always with a fixed number of rows, items and such, this can be an option for you. In most cases though, you will really only use this for very specific user interfaces that never change.
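In code, fixed navigation is little more than a lookup table. A minimal sketch, with a made-up `fixedNavigation` map and `nextFixed` helper for the row A, B, C:

```typescript
type Direction = "left" | "right" | "up" | "down";

// For every item, the explicit neighbour per direction (hypothetical ids).
type FixedMap = Record<string, Partial<Record<Direction, string>>>;

const fixedNavigation: FixedMap = {
  A: { right: "B" },
  B: { left: "A", right: "C" },
  C: { left: "B" },
};

// Returns the id to focus next, or the current id when no neighbour is defined.
function nextFixed(map: FixedMap, focusedId: string, dir: Direction): string {
  return map[focusedId]?.[dir] ?? focusedId;
}

console.log(nextFixed(fixedNavigation, "A", "right")); // "B"
console.log(nextFixed(fixedNavigation, "A", "left")); // "A" (no neighbour defined)
```

You can also see the scalability problem right away: every item you add means touching the entries of its neighbours too.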
Floating Focus
A concept you see across applications too, although less often, is ‘floating focus’. It’s still important to understand. While it doesn’t necessarily focus (hah) too much on calculating which item to shift focus to next, it does have a lot to do with the presentation of focus. In all of our earlier examples, the focus indicator moves along with the currently focused element. Looking at the earlier row example again, focus might start on item ‘A’ and then move to item ‘B’. The row itself does not move; rather, the focus indicator does.
That’s where floating focus comes into play, or rather, it changes the focus indicator completely. Rather than the focus indicator moving to a different position, with floating focus the row itself moves instead. The floating focus indicator ‘hovers’ above the rows and stays in place, even when the user moves left or right. Example time!
Can you guess what the focus indicator is here? I surely hope so ;). As you can see, in the first row the focus is on ‘A’. Then, in the second row, the user has pressed ‘right’ once, making the focus shift to ‘B’. But rather than the focus indicator itself moving to ‘B’, the row itself shifted and focus remains in place. The focus is floating above the row, and continues to stay there in most cases while the user is navigating across lists and rows and grids.
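The bookkeeping behind this is pleasantly small. Below is a sketch of the calculation, assuming all items in the row have the same width and spacing. The function name `rowOffset` and the pixel values are made up for the example. The indicator stays at a fixed x position, and the row gets shifted so the focused item ends up underneath it.

```typescript
// Returns the x position (in pixels) at which the row should start so that the
// item at `focusedIndex` sits exactly under the fixed focus indicator.
// Assumption: every item has the same width and the same gap to the next item.
function rowOffset(focusedIndex: number, indicatorX: number, itemWidth: number, gap: number): number {
  return indicatorX - focusedIndex * (itemWidth + gap);
}

const INDICATOR_X = 60; // the floating indicator never moves from here
const ITEM_WIDTH = 200;
const ITEM_GAP = 20;

console.log(rowOffset(0, INDICATOR_X, ITEM_WIDTH, ITEM_GAP)); // 60   (A under the indicator)
console.log(rowOffset(1, INDICATOR_X, ITEM_WIDTH, ITEM_GAP)); // -160 (row shifted left, B under the indicator)
```

In a web-based TV app you would typically apply this value as a horizontal translation on the row, ideally with a short animation so the user can follow the movement.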
Concluding
Focus can be difficult to implement, I’m not going to lie about that. But if there’s anything this blog has hopefully taught you, it’s that there are different ways to go about it. While spatial navigation offers the most flexibility, it is also the hardest to implement. On the other hand, directional navigation is much easier to implement and does cover most of the use cases of TV applications. However, if the UI becomes complex, it might be difficult to cover all your use cases with directional navigation alone.
If it were up to me, I would actually opt for a combination of different navigational implementations. While in most cases the directional approach might be enough, I might need to resort to fixed navigation for one or two specific ‘out of place’ buttons. Or I might use spatial navigation to cover those use cases, while still relying on directional for the rows and grids. I don’t want to waste any time with an algorithm looking for the next item when I already know it’s right next to the currently focused one. The power is in the combination of implementations.
And then, of course, floating focus can be an interesting approach to visually representing focus. While most applications opt for the ‘standard’ focus approach and explicitly put it on the item that receives focus, floating focus might be worth it for your specific use cases.
As always, I hope this blog has been useful and informative. If you have any questions, feel free to reach out!