Samuel Cheng
Software Engineer and UX Enthusiast

This project focused on the design and implementation of a multivariate information visualization for a non-profit medical organization, Floating Doctors. Our visualization allows the staff of Floating Doctors to quickly see the details of each patient visitation, track the most common diagnoses and treatments within a community, and explore how water sources and bano type relate to the overall health of the community. We partnered with Dr. LaBrot, founder of Floating Doctors, throughout the course of this project.


  • August - December 2016
  • Team of 4
  • User research, iterative design, Sketch, HTML/CSS/JS, React, D3, Bootstrap, video editing

Floating Doctors is an organization whose mission is “to reduce the present and future burden of disease in the developing world, and to promote improvements in health care delivery worldwide.” They use boats and ships to deliver health services to remote regions in Central and South America and work to bring both public health awareness and specialist medical knowledge to the developing world.

Floating Doctors provides an extensive data set for each patient visitation, which includes the location and date of treatment, patient demographics (e.g. age, gender, height, weight), vitals (e.g. blood pressure, temperature), family medical history (e.g. cancer, diabetes), substance use (e.g. smoking, drugs), pregnancy information, social/family history, the patient’s water source, diagnoses, and treatments.
Users, Goals, and Data

Our users were the staff and leadership of the Floating Doctors organization. We worked to create a visualization from the provided data set that can help users achieve the following 3 goals:

  • Track changes in population health via common health indicators.
  • Monitor the progress of deployed treatments and ongoing health issues in communities.
  • Generate reports and show status to stakeholders and potential supporters.
Early on, we determined that the original dataset contained too many variables to consider, and we did not want to build a system that mimicked the functionality of existing commercial tools such as Tableau, Qlik, and Spotfire. Instead, we decided to focus on the attributes that were most important or interesting for the users to learn about. The table below contains the data attributes we focused on in this visualization along with their data types.

In total, we had a dataset with 5 nominal, 2 ordinal, and 4 quantitative attributes. In addition, we constrained the dataset to patient visits within 2016, which resulted in 1639 cases.
Sample Analytic Questions/Queries

Using our visualization, we wanted the users to be able to answer the following example questions:

  • How have diagnoses of hepatitis in La Sabana changed over time?
  • Is there a correlation between bano type and diagnoses of cholera?
  • Has the introduction of soap to Pueblo Nueva in March 2016 had an impact on the presence of influenza in that community?
  • Are children more susceptible to certain diseases than adults?
Design Ideas

After several rounds of brainstorming, ideation, and iteration, we came up with a total of seven design ideas, two of which were multi-coordinated views and five of which were “innovative” views. We will explain and discuss these ideas below.

Design 1: Dashboard

This is a traditional dashboard layout of the kind that many commercial products produce. It is useful in that it shows a large amount of information at once and is familiar to anyone used to analytics. Its main benefit is that it can be customized to exactly what the customer wants to see, while still leaving room for creative interfaces in certain views. Brushing and linking is a common feature of such dashboards. However, because this type of standard multi-coordinated visualization is very common, it lacks the novelty of an interesting visualization. In addition, based on our conversation with Dr. LaBrot, Floating Doctors currently uses Qlik to create similar dashboards; hence, this particular visualization would not provide much additional value to the organization.

Design 2: Floating Dots

This design idea addresses two issues: it provides a new approach to the search function across a dataset that has a large number of variables and it takes advantage of storytelling to give the user potential interesting insights into the data.

Each data case is represented by a dot. Users can enter a name in the search bar or select a specific group of people (e.g. all women, all people with a given diagnosis, all people in a given community) and the corresponding dot(s) will be highlighted. Next, the system generates several views through which to explore the dataset while keeping that initial individual or group highlighted throughout the journey. The interface is scrollable; as the user scrolls, the dots move around into various other views. Each view can also be explored via clicking, hovering, or brushing and linking to give the user further control over the data.

This design idea is fun for the user to navigate and, given its storytelling element, helps the user lead stakeholders through a summary of the progress made. Because of the various views possible, it incorporates a time factor where a population can be tracked over a period of time. However, too little information is shown on each screen, which can be frustrating for users who want to see several graphs on a single page.

There are a few ways to enhance this visualization; one way is to show more graphs on each screen. To better track changes, we would need to add ways to display trends. To enhance the storytelling element, it was suggested that we could present a video clip and then highlight the data that relates to the video clip and the specific scene.

Design 3: Community Bars

This diagram focuses on each community over a period of time. Each colored bar represents a community and the x-axis is the range of time. Each pixel represented is a patient visitation in the community at that moment in time. Filters on the right allow the user to call out certain attributes of the dataset. Hovering over a single pixel brings up a detailed description of that visit.

This visualization allows a user to see how attributes of patients (e.g. diagnoses) change within a community over a period of time. Each pixel may also use shape or size to encode a bit more information, such as gender or age. However, one drawback of this design is that the colored bars have no inherent meaning other than to differentiate between the approximately thirteen communities visited in 2016. Because we expect that visits do not happen in all communities at once, the visitation data may not look like the picture above but may instead be clustered into various “steps” as time progresses, which leaves a lot of empty space.

This idea is similar to the concept of “semantic substrates” that may be of interest to us. The placement of pixels on the y-axis can be rationalized as pixel jitter; however, it may need to be more organized, especially if many patients are seen during that time. The positioning of pixels and the use of the y-axis still need to be resolved. It was further suggested that we could use circle size to show the number of patients. Other feedback included adding a trend line or bar graph at the top and side of the chart to show how a specific attribute of the dataset changed over a period of time.

Design 4: Bubble Diagram

This visualization groups patients into various circles. In this case, the circles can be all the diagnoses in the dataset. The size of each circle corresponds to the number of occurrences of that diagnosis. The user can zoom into various parts of the visualization to view more details. In the second picture above, there are colored circles which correspond to communities. Each pixel within a colored circle is a patient with that diagnosis within that community. Hovering over a pixel reveals detailed information about the patient as well as other occurrences of that patient in other diagnosis circles. Lines between larger circles show how strongly different diagnoses correlate with one another. The thicker the line, the higher the correlation.

This visualization is fun and is more conducive to user exploration. The user can see the state of diagnoses among patients at a moment in time. However, a possible drawback is that it doesn’t necessarily show trends over a period of time without the use of video animation to show circles shrinking or growing or pixels moving around. Depending on the data, some communities might always show a higher number of diagnoses simply because they are larger; hence, we would need to consider normalizing our data. Furthermore, we discussed grouping the circles by commonalities between communities as well.
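
One way the normalization mentioned above could work is to divide each community's count for a diagnosis by that community's total number of encounters, so circles reflect a share rather than a raw count. The sketch below assumes a hypothetical `CommunityEncounter` record shape with one diagnosis per encounter; the field names are illustrative and are not taken from the Floating Doctors dataset.

```typescript
// Hypothetical record shape; the real dataset's field names may differ.
interface CommunityEncounter {
  community: string;
  diagnosis: string;
}

// For each community, compute the share of its encounters that carry the
// given diagnosis. Dividing by the community's own encounter count keeps
// larger communities from dominating simply because they were seen more.
function diagnosisShareByCommunity(
  encounters: CommunityEncounter[],
  diagnosis: string
): Record<string, number> {
  const totals: Record<string, number> = {};
  const matches: Record<string, number> = {};
  for (const e of encounters) {
    totals[e.community] = (totals[e.community] || 0) + 1;
    if (e.diagnosis === diagnosis) {
      matches[e.community] = (matches[e.community] || 0) + 1;
    }
  }
  const shares: Record<string, number> = {};
  for (const community of Object.keys(totals)) {
    shares[community] = (matches[community] || 0) / totals[community];
  }
  return shares;
}
```

A share computed this way could drive circle size directly, although very small communities would then need care, since a single encounter can swing the share considerably.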

Design 5: Pixel Clusters

This visualization involves placing every patient record onto a canvas. Each patient is represented by a pixel. The user can use their finger (on a tablet) or mouse to “lasso” any number of pixels and cluster them based on some attribute. In the pictures above, for example, the user may lasso the entire dataset and cluster based on gender. Then, they may choose to cluster the males by community or by a specific diagnosis.

One suggestion is to use “smart-clustering” techniques so that the system chooses an initial grouping of the pixels. This design is very exploratory and is extendable to a tablet format. In addition, it allows the user to see various proportions of the population and to see the details as well as the big picture. However, it does not show how a diagnosis or treatment changes over a period of time, nor does it show relationships between data attributes.

Design 6: Ray Scanner

This visualization involves an ordinary scatterplot of patient data where each patient is captured as a pixel. The user is able to set different x- and y-axis values, after which the pixels reshuffle accordingly. The most prominent feature of this design is the use of “light ray” filters to illuminate areas of the dataset on the graph. The user can click and drag a section of either axis, select a specific filter (e.g. community, diagnosis, treatment), and view the pixels in that range which match that filter. Multiple filters can be overlaid on each other.

This allows the user to explore more of the data using the filters much like a flashlight, illuminating various hidden gems. It is an innovative view, but by itself it may not offer much that a traditional dashboard would not provide. One criticism is that it is unclear why the user wouldn’t simply illuminate the whole axis every time. Perhaps it would be possible to combine this visualization with the multi-view or one of the previous concepts such as Floating Dots. In addition, we could introduce a time variable into this visualization, which would allow it to track changes and monitor progress over time.

Design 7: Interactive Radial Lines

This visualization shows a radial edge of diagnoses on the left and a radial edge of treatments on the right. The center is a plot of the number of patient visits for certain dates and communities. The red vertical lines correspond to female visits and the blue vertical lines correspond to male visits. Each line is constructed from individual pixels, which are the individual patients. Hovering over a patient illuminates straight lines that connect to the diagnosis edge and the treatment edge, showing the diagnosis and treatment that correspond to that particular patient. The user may also hover over items in the diagnosis and treatment edges to show all the patients with that diagnosis or treatment respectively.

One consideration is that while the Floating Doctors organization is familiar with the countries and cities they visit, supporters may not be. Therefore, we could include a map of each country with consultation locations. When a user hovers over a location, the map indicates where that community is located.

This view allows the user to explore how all the diagnoses and treatments are intertwined between individuals. It is more conducive to exploring than other visualizations and the center graph can be modified using filters to support different views, such as age, while keeping the diagnoses and treatment edges the same. Further features could include the capability to select a data point, diagnosis, or treatment to use as filters, increasing the width of the bars so the cases are easier to select, and including the capability to sort from high to low, or to sort from most popular diagnosis to least, etc.

However, this does not show patient data over a period of time. Though there is a time scale involved, it is not quite the same as simply seeing a trend line. Although we can overlay a thicker and darker trend line on the center plot, doing this may make the entire visualization harder to read since a critique of this visualization is that it would have potentially too many lines.

Based on the three high-level goals, we categorized each design under the goals that best played to its strengths.

Track changes in population health via common health indicators.
  • Idea #4: Bubble Diagram
  • Idea #3: Community Bars (clustering part)
  • Idea #6: Ray Scanner
Monitor the progress of deployed treatments and ongoing health issues in communities.
  • Idea #3: Community Bars
  • Idea #7: Interactive Radial Lines
Generate reports and show status to stakeholders and potential supporters.
  • Idea #1: Dashboard
  • Idea #2: Floating Dots
Choosing an appropriate design proved difficult, so we followed several principles. First, we did not focus too much on finding a design that could answer all potential questions. Instead, we prioritized designs with a clear time component, since tracking changes over a period of time was important to the client. Next, we leaned towards more common views when they could answer more questions, rather than pushing a very novel view that couldn't answer as many.

Upon further deliberation and brainstorming, we decided to iterate upon our seventh design choice, Interactive Radial Lines. The resulting mockups are shown below.

This design was chosen because it was able to combine multiple variables, including a time axis, into one view. Because visits to communities occur on various dates that could be weeks apart, the x-axis does not represent a continuous variable; rather, each "tick" on the axis is a separate visit on a separate date. Each pixel is a patient encounter. Therefore, the higher the pixel bar, the more patients were treated at that community on that date. The left, right, and bottom areas hold filters corresponding to common diagnoses, treatments, water sources, bano type, and community locations.
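
The discrete x-axis described above can be thought of as grouping encounters into one column per (date, community) visit and then spacing the columns evenly, regardless of how many days separate them. The following is a minimal sketch of that grouping, not the prototype's actual code; the `VisitEncounter` shape and the ISO date format are assumptions.

```typescript
// Hypothetical record shape; dates are assumed to be ISO strings (YYYY-MM-DD).
interface VisitEncounter {
  patientId: string;
  community: string;
  date: string;
}

// One column per visit; its height in the chart is encounters.length.
interface VisitColumn {
  community: string;
  date: string;
  encounters: VisitEncounter[];
}

// Group encounters by (date, community) and return the columns in
// chronological order. The column index, not the date itself, would be
// used as the x position, which is what makes the axis discrete.
function buildVisitColumns(encounters: VisitEncounter[]): VisitColumn[] {
  const byVisit: Record<string, VisitColumn> = {};
  for (const e of encounters) {
    const key = e.date + "|" + e.community;
    if (!byVisit[key]) {
      byVisit[key] = { community: e.community, date: e.date, encounters: [] };
    }
    byVisit[key].encounters.push(e);
  }
  // ISO date strings sort chronologically when compared as plain strings.
  return Object.keys(byVisit)
    .sort()
    .map((key) => byVisit[key]);
}
```

Using the column index as the x position means a gap of two days and a gap of six weeks look identical, which matches the design but is also the source of the "is this a continuous axis?" question discussed in the evaluation.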

Hovering over a pixel highlights the diagnosis, treatments, water source, bano type, and community corresponding to that particular patient. Clicking on a pixel reveals a modal dialog that shows more detailed information about the patient including the patient's name, age, community, and collected vitals such as height, weight, blood pressure, blood sugar, and hemoglobin levels. Clicking the dots on the x-axis illuminates all the patient encounters on that specific visit.

Hovering over any of the five variables (e.g. diagnosis) illuminates all the patient encounters with that variable and sorts the other four variables by prevalence in relation to the first variable. Bars appear showing the relative proportions of the ordering.
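
The hover behavior above amounts to filtering encounters by the hovered value and then ranking the values of another variable by how often they occur among the matches. A minimal sketch follows, assuming a simplified hypothetical record with exactly one value per variable (real encounters may carry several diagnoses or treatments); the names `HoverVariable`, `HoverRecord`, and `rankByPrevalence` are illustrative.

```typescript
// The five filterable variables named in the text; the key names are assumed.
type HoverVariable = "diagnosis" | "treatment" | "waterSource" | "banoType" | "community";
type HoverRecord = Record<HoverVariable, string>;

interface RankedValue {
  value: string;
  count: number;
  proportion: number; // share of highlighted encounters; could drive bar length
}

// Rank the values of `target` among encounters whose `hovered` variable
// equals `hoveredValue`, most prevalent first (ties broken alphabetically).
function rankByPrevalence(
  records: HoverRecord[],
  hovered: HoverVariable,
  hoveredValue: string,
  target: HoverVariable
): RankedValue[] {
  const highlighted = records.filter((r) => r[hovered] === hoveredValue);
  const counts: Record<string, number> = {};
  for (const r of highlighted) {
    counts[r[target]] = (counts[r[target]] || 0) + 1;
  }
  return Object.keys(counts)
    .map((value) => ({
      value: value,
      count: counts[value],
      proportion: counts[value] / highlighted.length,
    }))
    .sort((a, b) => b.count - a.count || a.value.localeCompare(b.value));
}
```

Calling this once per remaining variable would yield the four sorted lists, and the `proportion` field is one plausible basis for the bars that appear next to each value.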

Clicking any of the five variables reveals a modal dialog containing detailed multiview graphs pertaining to that specific variable. The graphs include age distribution, average blood hemoglobin by age, average blood pressure by age, and average BMI by age. The modal dialog also contains information about the proportion of male and female patients and total number of entries. Holding the shift key and clicking other variables further filters the data. For example, if the Ensenada community is clicked followed by the diagnosis of Worms, the data shown are the patients who have worms in Ensenada.
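
The shift-click behavior can be read as a conjunction of selections: a plain click starts a new selection, and each shift-click narrows it further. The sketch below illustrates that reading under assumed names (`FilterableRow`, `Selection`, `nextSelections`, `applySelections`); how the prototype actually stores its selection state is not described in this write-up.

```typescript
// Hypothetical row shape limited to three of the filterable variables.
interface FilterableRow {
  community: string;
  diagnosis: string;
  treatment: string;
}

interface Selection {
  field: keyof FilterableRow;
  value: string;
}

// A plain click replaces the current selection; a shift-click adds to it.
function nextSelections(current: Selection[], clicked: Selection, shiftKey: boolean): Selection[] {
  return shiftKey ? current.concat([clicked]) : [clicked];
}

// Keep only the rows that satisfy every active selection (logical AND).
function applySelections(rows: FilterableRow[], selections: Selection[]): FilterableRow[] {
  return rows.filter((row) => selections.every((s) => row[s.field] === s.value));
}
```

For the example in the text, clicking Ensenada and then shift-clicking Worms produces two selections, and `applySelections` returns only the rows for patients who have worms in Ensenada.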
Prototype Implementation

The dataset provided by Floating Doctors was originally in Microsoft Excel spreadsheet format and did not include the translation of diagnosis and treatment codes into human-readable terms. The first major task was cleaning the dataset and building a JSON file to import into the Firebase NoSQL database.
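
As an illustration of the cleaning step, the sketch below translates coded rows into readable terms and serializes them as JSON. It assumes the spreadsheet has already been exported to plain objects; the column names, the lookup tables, and the codes in the usage example are invented for illustration and do not reflect the actual Floating Doctors coding scheme. The Firebase import itself is not shown.

```typescript
// Hypothetical shape of a row after exporting the spreadsheet.
interface RawRow {
  patientId: string;
  diagnosisCode: string;
  treatmentCode: string;
}

// Row with codes replaced by human-readable terms.
interface CleanRow {
  patientId: string;
  diagnosis: string;
  treatment: string;
}

// Lookup table from code to label.
type CodeTable = Record<string, string>;

// Translate one row; unknown codes are kept visible rather than dropped,
// so gaps in the lookup tables can be spotted and fixed later.
function translateRow(row: RawRow, diagnoses: CodeTable, treatments: CodeTable): CleanRow {
  return {
    patientId: row.patientId,
    diagnosis: diagnoses[row.diagnosisCode] || "Unknown (" + row.diagnosisCode + ")",
    treatment: treatments[row.treatmentCode] || "Unknown (" + row.treatmentCode + ")",
  };
}

// Build the JSON text for the whole dataset.
function toJson(rows: RawRow[], diagnoses: CodeTable, treatments: CodeTable): string {
  return JSON.stringify(rows.map((r) => translateRow(r, diagnoses, treatments)), null, 2);
}

// Usage with made-up codes:
// toJson([{ patientId: "p1", diagnosisCode: "D01", treatmentCode: "T01" }],
//        { D01: "Worms" }, { T01: "Example treatment" });
```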

React.js, D3.js, and Bootstrap.js, along with HTML and CSS, were used to build the user interface, while database queries were handled and cached by a Node.js backend server. Using React.js, we built independent, reusable components corresponding to various UI elements (e.g. a patient square, a patient bar, an axis, a line) and incorporated them into the main view.
Evaluation and Conclusion

After completing the prototype, we conducted a demo and evaluation of the system. Users found the design novel and easy to navigate. However, because of the placement of the community information, users mistook it for an x-axis label. Furthermore, it was difficult to convey that the axis was not continuous but rather a series of discrete visitations. Perhaps a slight separation in the corner where the x- and y-axes would meet would convey that this is not an axis at all. Other possibilities include using a dotted line or no line at all.

Because each square representing a patient encounter was quite small, users found it difficult to select a particular square. One possible fix would be to implement a fisheye view (a focus-plus-context technique that magnifies the area around the cursor), which would make it easier for users to click on the appropriate interface elements.
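
To make the fisheye idea concrete, here is a minimal one-dimensional sketch based on the commonly used Sarkar-Brown distortion function g(t) = (d + 1)t / (dt + 1), where t is the normalized distance from the focus and d is the distortion factor. This was not part of the prototype; the function name and parameters are assumptions for illustration.

```typescript
// Map a position x within [min, max] to its fisheye-distorted position,
// given the focus (e.g. the cursor's x coordinate) and a distortion factor.
// distortion = 0 leaves positions unchanged; larger values push points
// away from the focus, which enlarges the spacing around it.
function fisheye(x: number, focus: number, min: number, max: number, distortion: number): number {
  if (x === focus) return focus;
  // Distance from the focus to the boundary on the same side as x.
  const boundary = x < focus ? min : max;
  const span = Math.abs(boundary - focus);
  const t = Math.abs(x - focus) / span; // normalized distance in (0, 1]
  const g = ((distortion + 1) * t) / (distortion * t + 1);
  const direction = x < focus ? -1 : 1;
  return focus + direction * g * span;
}
```

Applying this to the left and right edges of each patient square would widen the squares near the cursor and compress those farther away, while the endpoints of the axis stay fixed. The same function could be applied to the vertical coordinate to make the squares taller as well.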

Finally, the system prototype was demoed to Dr. LaBrot of Floating Doctors who requested access to the codebase for his internal use.