Defining Models


Introduction

At the heart of every Django application, arguably, are the relationships between different kinds of data. Django's models are where those relationships are defined.

A model in Django essentially defines the relationships between elements of a database, which is why it's so critical. Get it right, and you save time and money, in terms of the amount of processing, when you want to work with interrelated data types. This is why it pays to be careful with Django's models.py module.

The models for a blog

I made a couple of architectural decisions early on that influence how much complexity my markdown-driven blog application supports. First, let's remember what markdown is. A markdown file is filled with markdown-formatted text that will have to be processed into html at some point. But a markdown file also usually comes with a header, or preamble, written in yaml (a type of structured data, similar to the JSON standard) that adds extra context information. Some of the parameters you will find in a typical yaml header include:

  • title: what you would like your blog post to be called. This is often provided separately from the top of your markdown file because the title gets used in many different contexts and places. It needs to be available as a variable that you can insert at various steps.
  • tags: tags are often
  • subtitle: you may want to show a subtitle below the title, styled beyond what typical markdown allows, and thus also need to provide it separately in the yaml preamble
  • summary: Similar to the subtitle, a summary can be provided that serves as a blurb when the post is rendered as a summary
  • image: You can often pair a post with an accompanying image. Sure, you can also embed images into the markdown content itself, but with an image specified in the preamble, you can format the image in a number of other ways.
  • formatting information: In pandoc-flavored markdown you can include all sorts of parameters that influence page formatting, default font, etc
  • author and date: You can include more details about your post
  • custom information: But that's not all. Because yaml is a data format, you can put anything you want in there. Different markdown processors may ignore custom variable names, but that doesn't mean you can't include them. And when you're writing your own markdown processor, you really can put anything you want in your yaml preamble.

This list is not exhaustive. But you get the idea -- markdown files and their accompanying yaml preambles can include a variety of information.
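To make the file layout concrete, here is a minimal sketch of pulling the preamble apart from the markdown body. It assumes the pandoc-style convention of a `---` line at the very top and a closing `---` or `...` line. The function names and the sample post are hypothetical, and `parse_flat_preamble` is only a naive stand-in for a real yaml parser: it handles flat `key: value` lines and nothing else.

```python
def split_front_matter(text):
    """Split a markdown document into (preamble_text, body_text).

    Assumes the preamble, if present, opens with a '---' line at the very
    top and closes with a '---' or '...' line.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no preamble at all
    for i in range(1, len(lines)):
        if lines[i].strip() in ("---", "..."):
            preamble = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return preamble, body
    return "", text  # opening delimiter never closed: treat as plain markdown


def parse_flat_preamble(preamble):
    """Naive stand-in for a yaml parser: handles only flat `key: value` lines."""
    data = {}
    for line in preamble.splitlines():
        if ":" not in line or line.lstrip().startswith("#"):
            continue  # skip comments and anything that is not key: value
        key, _, value = line.partition(":")
        data[key.strip()] = value.strip()
    return data


# Hypothetical sample post, including one custom key.
sample = """---
title: Defining Models
summary: How I model posts
my_custom_key: anything
---
# Heading

Body text.
"""

preamble, body = split_front_matter(sample)
meta = parse_flat_preamble(preamble)
```

Custom keys such as `my_custom_key` come through untouched, which is the property the rest of this post relies on. Nested values and lists would need a proper yaml library.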

BUT

But they don't have to. And so here's the issue. If you create a highly specified model for your markdown-based post, with all these different variables, then you run into several problems. First, the data is now stored in two places: in the original source file, and in your database. What happens when you update the original markdown file, or the database entry? Does a change in the database propagate back to the original source file, or do you have to check for conflicts? If data from the original post doesn't match the type expected in the database, how will you handle it? Do you need to massage each kind of data differently before importing it into the database? What if some of your markdown files include one set of yaml variables, while others (perhaps from a different provenance) have a slightly different mix? This is my problem, because I've written in markdown, for many different markdown processors, for many years. There is no standard, so I have a variety of different kinds of data hanging out in there.

So here is what I want to do:

First, treat the database entry for a post as a mirror of what is contained in the original markdown file. If I change the markdown file, I will need to provide a quick and easy way to update the database entry for the post, but if I change the data associated with a particular post in the database, I shouldn't concern myself, for the time being at least, with propagating it back to the file. That may come later.

Second, I only want to add fields to my model when that data will be used directly in a query, not when it is simply going to be processed and displayed. For everything else, I should just import and store some version of the yaml preamble and the markdown, so that I can access custom data if need be and process the markdown on demand, without making a separate field for each value. If I need to search through specific data using the underlying database's functionality, however, that data should be imported and stored in a form that makes those queries efficient.
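The two rules can be sketched in plain Python, with a dict standing in for the database table. Everything here is an assumption for illustration: the function names, the example path and date, the choice of ISO-formatted date strings, and the idea that the preamble has already been parsed into a dict of JSON-friendly values.

```python
import json
from datetime import datetime


def build_record(path, meta, markdown):
    """Turn one source file into a flat record that mirrors it."""
    raw_date = meta.get("date")
    return {
        "path": path,
        # Promoted because they are queried directly.
        # Assumes ISO dates such as "2020-05-17"; anything else needs massaging.
        "date": datetime.fromisoformat(raw_date) if raw_date else None,
        "tags": list(meta.get("tags", [])),
        # The whole preamble is kept as a json string, so custom keys stay reachable.
        "json": json.dumps(meta),
        # The markdown is stored unprocessed and rendered on demand.
        "content": markdown,
    }


def sync_from_file(store, path, meta, markdown):
    """One-way update: the file always overwrites the stored entry."""
    store[path] = build_record(path, meta, markdown)
    return store[path]


store = {}  # stand-in for the database table, keyed by source path

sync_from_file(
    store,
    "notebook/django/models.md",
    {"title": "Defining Models", "date": "2020-05-17", "tags": ["django"]},
    "# Defining Models",
)
# Editing the file and syncing again replaces the entry instead of duplicating it.
sync_from_file(
    store,
    "notebook/django/models.md",
    {"title": "Defining Models", "date": "2020-05-17", "tags": ["django", "orm"]},
    "# Defining Models\n\nUpdated.",
)
```

In Django itself, the same "file wins" behaviour could probably be built on `Post.objects.update_or_create(path=..., defaults={...})`, with the many-to-many fields set afterwards.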

Models

Post

The post model holds all the information about a blog post:

  • path: the path to the source file. I will be using the folder structure of the markdown source directory to organize my site.
  • author: I will want to link each post to a specific user of my site. Most notably, me.
  • date: the date of the post. This will be used in various searches.
  • tags: like the date, I want a post to be browsable by how it's tagged. Therefore, I need to extract the tags so that a query for all posts with a given tag turns up the post.
  • path categories: the folders the source file lives in, which is how I will sort and organize posts according to the source file tree
  • json: the original yaml preamble stored as a json string.
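  • content: the original markdown source stored as a string. I will want to provide a way to search the contents of the markdown. Eventually I may want to search the formatted html instead, but for now the markdown is sufficient.

# Imports assumed at the top of models.py; User is assumed to be Django's built-in user model.
import json
import os

from django.contrib.auth.models import User
from django.db import models
from django.urls import reverse


# Tag and PathCategory (shown further down) must be defined above Post in
# models.py, or be referenced by the strings 'Tag' and 'PathCategory'.
class Post(models.Model):
    path = models.CharField(max_length=200, null=True, unique=True)
    author = models.ForeignKey(User, on_delete=models.CASCADE)
    tags = models.ManyToManyField(Tag, blank=True)
    json = models.TextField(null=True)
    path_categories = models.ManyToManyField(PathCategory)
    date = models.DateTimeField(blank=True, null=True)
    content = models.TextField(blank=True, null=True)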

    def __str__(self):
        return str(self.path)

    ...

There are a number of custom methods I've added to the post model that I will talk about in other places, but I include them here for completeness:

...

    def __str__(self):
        return str(self.path)

    def get_info_url(self):
        return reverse('post-info', args=[str(self.id)])

    def get_rendered_url(self):
        return reverse('post', args=[str(self.id)])

    def get_markdown_url(self):
        # Split "top_dir/sub/dirs/file.md" into its top folder and the rest,
        # dropping the file extension from the last element.
        path_elements = self.path.split('/')
        if len(path_elements) > 1:
            header = path_elements[0]
            dirs = path_elements[1:]
            dirs[-1] = os.path.splitext(dirs[-1])[0]
            dirs_string = '/'.join(dirs)
            return reverse('test', kwargs={'subpath': dirs_string, 'top_dir': header})
        # A file that is not inside any folder has no markdown url.
        return None

    def get_title_safe(self):
        # The json field is nullable, so fall back to an empty preamble.
        my_json = json.loads(self.json or '{}')
        try:
            title = my_json['title']
        except KeyError:
            # No title in the preamble: use the file name instead.
            title = os.path.split(self.path)[1]
        return title

    def get_summary_safe(self):
        my_json = json.loads(self.json or '{}')
        try:
            summary = my_json['summary']
        except KeyError:
            summary = None
        return summary

Path Categories

A Path Category is essentially a relative filepath in the original markdown source directory. Every post is a child of the folders (path categories) it is in. Including path / directory information allows quick and easy searching for files within a given folder (and its children), or within any parent of the current path. The only data within a path category is its name.

class PathCategory(models.Model):
    name = models.CharField(max_length=200, unique=True)

    class Meta:
        verbose_name_plural = "path categories"

    def __str__(self):
        return str(self.name)
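As a rough sketch of how the category names could be derived from a post's path, the helper below lists every folder above a file, outermost first. The function name and the example path are hypothetical, and it assumes paths are relative and use forward slashes.

```python
from pathlib import PurePosixPath


def path_categories_for(path):
    """List every ancestor folder of a relative source path, outermost first."""
    parents = PurePosixPath(path).parents
    # parents runs from the innermost folder out to ".", so drop "." and reverse.
    names = [str(parent) for parent in parents if str(parent) != "."]
    return list(reversed(names))


categories = path_categories_for("notebook/django/models.md")
# categories is ["notebook", "notebook/django"]
```

Linking a post to each of these names means a search for everything under `notebook` also finds posts that live in `notebook/django`.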

Tag

Tags are stored as their own model, and each tag name is unique. Many different posts can be associated with the same tag, and many different tags can be associated with the same post. The only data in a tag is its name.

class Tag(models.Model):
    name = models.CharField(max_length=200, unique=True)

    def __str__(self):
        return str(self.name)
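Because the tag name is unique, the tags pulled out of a preamble need to be cleaned up before they are looked up or created. The helper below is a hypothetical sketch. It assumes tags may arrive either as a list or as a comma-separated string, and that tag names should be compared case-insensitively, which is my own assumption rather than a rule stated above.

```python
def normalize_tags(raw):
    """Return a sorted list of unique, lower-cased tag names."""
    if raw is None:
        return []
    if isinstance(raw, str):
        raw = raw.split(",")  # assumed form: "django, python"
    names = {str(item).strip().lower() for item in raw}
    names.discard("")  # ignore empty entries such as a trailing comma
    return sorted(names)


tags_from_string = normalize_tags("Django, python,django")
tags_from_list = normalize_tags(["ORM", " orm "])
```

Each resulting name could then be passed to something like `Tag.objects.get_or_create(name=name)` before being attached to the post.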

Front Page

The front page has many different characteristics I will want to be able to change on the fly, including:

  • featured article: the article at the top of the front page
  • recent article(s): a list of recent articles that may be included on the front page as well as linked to in the right column.
  • blog-posts: the list of articles included below the featured/recent articles.

About

I am an engineer and educator, having spent ten years as a professor. My goal is to help you build your knowledge of design and technology, get your hardware working, and propel your startup or small business. Get in touch!