Another month has passed in the blink of an eye. Overall, my productivity this month was not high, and I am not very satisfied with myself.
Grafana and Prometheus
I basically spent the first week trying to use Grafana; in other words, slacking off.
Part of a Service Engineer’s work is ensuring that the various servers are operating normally and that their loads stay within a reasonable range, so we use Prometheus and Grafana. Prometheus is a time-series database: it collects metrics from the various nodes and stores them. Grafana, on the other hand, is a plotting tool that queries the time-series database and presents the results as graphs.
The first task this month was to learn how to query Prometheus’ database and then modify a Grafana dashboard.
When I wrote my queries, none of the metrics returned any values. I initially thought I was using the query language incorrectly, so I studied the documentation for two days. Later, I found out that the Docker container’s storage was full. The maximum cache capacity configured for Prometheus was larger than the storage allocated to the Docker container, so Prometheus kept filling the container without cleaning up old data. My colleague cleared some data, but even after freeing up the storage, we couldn’t update the Docker image. In the end, it turned out that the unit in the configuration was wrong - it should have been GB instead of g…
That’s how I stumbled through the first week.
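A unit mix-up like this can be caught before deployment with a small check. The sketch below is only an illustration: it assumes the setting in question behaves like Prometheus’ `--storage.tsdb.retention.size` flag, which documents the units B, KB, MB, GB, TB, PB and EB (powers of two), but I never confirmed that this was the exact setting we had. The function names `parse_size` and `fits_in_container` are hypothetical helpers, not part of Prometheus or Docker.

```python
import re

# Unit suffixes as documented for Prometheus' --storage.tsdb.retention.size
# flag (assumption: this is the kind of setting that was misconfigured).
# The units are powers of two, so 1KB is 1024 bytes.
UNITS = {"B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4, "PB": 5, "EB": 6}


def parse_size(text: str) -> int:
    """Return the size in bytes, or raise ValueError for an unknown unit."""
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", text.strip())
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    number, unit = match.groups()
    if unit not in UNITS:  # case-sensitive on purpose: "g" is rejected
        raise ValueError(f"unknown unit {unit!r} in {text!r}")
    return int(number) * 1024 ** UNITS[unit]


def fits_in_container(retention: str, container_disk: str) -> bool:
    """Hypothetical pre-flight check: the retention limit must be smaller than the disk."""
    return parse_size(retention) < parse_size(container_disk)
```

With this, `parse_size("50g")` raises a `ValueError` instead of silently being ignored, and `fits_in_container("50GB", "20GB")` returns `False`, which is the situation we were in.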
Giving up Windows - Tortured by Chef Kitchen for Three Weeks
This task itself was small, but it turned out to be quite tough.
I needed to modify a Chef cookbook, but it wasn’t just about making the changes: I also had to run tests to ensure that the modified cookbook would run stably. There is a significant difference between the company’s production-grade work and my personal projects. I can be very flexible with my own projects - if something breaks, I can just fix it. Production-grade work, however, requires thorough testing before release, and dependency versions must be locked to known-working versions, at least at the level of the major release number (the first number in #.#.#).
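To make "locked to the major release number" concrete, here is a minimal sketch of the idea in Python. It is not how Chef resolves dependencies; the dependency names and version numbers are made up for illustration, and `violates_major_lock` is a hypothetical helper.

```python
def major(version: str) -> int:
    """Return the first number of a #.#.# version string."""
    return int(version.split(".")[0])


def violates_major_lock(locked: dict[str, str], resolved: dict[str, str]) -> list[str]:
    """List dependencies whose resolved major version differs from the locked one."""
    return sorted(
        name
        for name, version in resolved.items()
        if name in locked and major(version) != major(locked[name])
    )


# Hypothetical dependencies, for illustration only.
locked = {"example_dep_a": "2.3.1", "example_dep_b": "0.9.4"}
resolved = {"example_dep_a": "2.7.0", "example_dep_b": "1.0.0"}
```

Here `violates_major_lock(locked, resolved)` reports only `example_dep_b`: moving from 2.3.1 to 2.7.0 stays within the same major release, while 0.9.4 to 1.0.0 does not.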
Because the enterprise version of ChefDK is expensive (many open-source projects make money this way), BlackBerry uses a version from many years ago that did not require a paid enterprise license. However, this also means that all related plugins need to be pinned to match that older version, which makes locking versions a laborious task. I spent a whole week installing Test Kitchen to test my cookbook. Overall, I felt that a lot of time went into seemingly redundant tasks, such as solving authentication and permission issues and resolving testing problems. To some extent, I can only blame my own inadequate skills. Still, it is undeniable that I gained some abilities in the process of solving these problems, such as familiarity with the Linux file system.
By the third week, I really wanted to give up. I asked my co-op buddy if we could skip the testing step, or if he could help me with testing. But by the time he replied, I had figured out the issue on my own… It seems I shouldn’t give up so early. However, there are times when you try to solve a problem by yourself for a long time, only to find out in the end that the issue is not on your side, which is really frustrating. (During configuration, I couldn’t log in to the VM no matter what I tried. I wondered for a long time whether it was an issue with my SSH key type, only to discover that the team had not added me to the security group.)
Since many of the environments are for internal company use, almost none of the problems can be found online. This time I couldn’t just paste the error message into a browser; instead, I had to follow the error message into the source code to see where the bug was. My second option was to ask my colleagues, but they often take a while to answer, and the answer may not always be correct. To get the right answer, you have to ask the right question, and you still need to look at the source code. Sometimes what appears to be an incorrect parameter type (can’t convert nil to string) is actually caused by not having generated an SSH key on my computer; what seems to be an incorrect parameter length (length is 219, expect 21) is actually caused by an incorrect SSH key type.
At the very least, I now dare to dig into the source code to find the problem, and I can face the error messages returned by the shell more calmly.
Jira - First Work Order - Project Management
In the second week, my boss gave me my first work order on the Jira platform. It was the first time I understood what a work order really is. After my boss introduced the different types of work orders and I got to know the Jira platform, I was actually very excited, because the task management software I had used before only supported a folder-style hierarchy. It was difficult to depict scenarios like “Task A blocked by Task B.” Jira provides a more flexible way of managing tasks, and it also allows custom workflows. On the day my boss finished explaining, I tried to list the tasks for my personal projects in Jira. As a result, I realized I had never had a clear direction before. I was always working a bit on one feature, then a bit on another, then realizing that some features were too complex to implement, and giving up. Jira gave me a feeling of “order,” picking up the scattered tasks, organizing them, and laying them out in front of me.
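The difference between a folder hierarchy and “Task A blocked by Task B” can be sketched as a small dependency graph. This is only an illustration using Python’s standard `graphlib` module (Python 3.9+), not how Jira stores issue links; the task names and the helpers `execution_order` and `ready_now` are made up.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key is blocked by every task in its set.
blocked_by = {
    "design API": set(),
    "implement feature": {"design API"},
    "write docs": {"implement feature"},
    "release": {"write docs", "implement feature"},
}


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order where blockers always come first."""
    return list(TopologicalSorter(graph).static_order())


def ready_now(graph: dict[str, set[str]], done: set[str]) -> list[str]:
    """Return unfinished tasks whose blockers are all done."""
    return sorted(
        task
        for task, blockers in graph.items()
        if task not in done and blockers <= done
    )
```

A folder tree can only say that a task belongs under one parent, while this structure lets a task such as "release" wait on several others at once. Calling `ready_now(blocked_by, {"design API"})` shows which task can be picked up next.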
PD19 - Interview Assignment - Informational Interview for Service Engineer
For the PD19 course, I had to do an informational interview, so I interviewed my co-op mentor. Here are the questions I asked and his answers, which I found quite beneficial.
- What keeps you at BlackBerry?
The people I work with and probably just some pride in being here. I am comfortable where I am, and money is not a big driver in my life, and I still continue to learn new things.
- Why did you switch from the NOC to HELM?
I did a development opportunity with the HELM team while I was in the NOC, and a few months later Anna offered me a job with HELM. Moving off shift work and joining a team that wanted me was a big deal.
- What is the most challenging problem you have resolved with your work?
The most challenging problem is always the one I don’t think I have the skills to do. I was assigned to help move our Intersect SV in April 2020 (right after we moved to work from home), and I thought it was going to be an impossible task because I did not have any Python coding skills going in. I was still able to complete it on time, and it felt really good.
- What are the key characteristics and skills (both technical and non-technical) one should have to be in this field?
Humility. I’ve found in my experience with other people in the field that if you think you know it all, you don’t, and you’re going to rub people the wrong way. You might always need to work with someone in the future, so it’s never good to treat someone poorly. Be humble with what you know and be able to back up what you do know. For technical skills, it’s hard to pinpoint what is “important” because it changes. Someone may say something is the most important code to know, then 10 years later, people have stopped using it. I think it’s more important to have the ability to take what you know about one technical skill and transfer whatever percentage you can to the new thing you are working on.
- What do you think is the most boring part of this job?
Meetings and paperwork. They are both important and can provide value (going back to remember what you did or writing down what you did to support your teammates), but sometimes they are very boring. Also, being in enterprise makes it possible for certain “smart” people to be toxic because we can’t afford to lose them. (This was something he told me privately later, as it wasn’t very convenient to type it out.)
- What do you get from this job? What do you find the most rewarding?
I get to help support some great security products while working with great people. The most rewarding thing is finishing work that I get assigned that I initially thought I couldn’t do.
- What would you say to your younger self? (What would you have done differently?)
There are lots of lessons you’ll learn, but learn from them, or you’ll repeat them. Don’t try to compare your skills to your co-workers because they might be more skilled than you. If you think you are stupid because your co-worker is more skilled, that’s probably not true. This is something that even current-day Ian has a problem with.